Thursday, March 23, 2006

app_server Memory Management Revisited

I recently looked into why BeIDE's interface only had green squares where its icons should have been (bug #313). The function that imports the client's bitmap data did not work correctly, and while I was playing with it, the app_server suddenly crashed, and continued to do so in a reproducible way.

How was this possible? Bitmaps are located in a memory pool shared between the app_server and an application. Unfortunately, the app_server put those bitmaps into arbitrary larger areas, and put the structures managing that space into those areas as well - just like a userland memory allocator would do. However, if a client clobbered memory outside of its space in those areas (and that's what buggy clients do all the time), those structures could easily be broken, which caused the app_server to crash the next time it tried to use them. Also, since all applications shared the same areas, they could easily clobber each other's bitmaps as well.
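To illustrate the problem, here is a simplified sketch (not the actual app_server structures): the old allocator kept its bookkeeping directly in front of the client-visible data, inside the very memory the client could write to.

    // Simplified sketch - not the real app_server structures.
    struct chunk_header {
        size_t         size;    // size of the following bitmap data
        chunk_header*  next;    // next chunk in this shared area
    };

    // A buggy client that writes past the end of its bitmap, e.g.
    //     memset(bitmapData, 0, bitmapSize + 100);
    // silently overwrites the next chunk_header, and the app_server
    // crashes the next time it walks the chunk list.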

But there were even more disadvantages to the way client memory was managed: the client would clone that area once for every bitmap in it. For an application like Tracker, with potentially tons of icons (which are bitmaps), that wasted huge amounts of address space: if the area was 1 MB large and contained 500 icons, Tracker would have cloned that area 500 times, once for each icon, wasting 500 MB of address space. With a folder full of image thumbnails, the maximum limit (2 GB of address space per application) could have been reached with ease. Not a very good idea.

Another problem with the previous solution was memory fragmentation and contention: if many applications were allocating server memory at the same time, their memory would have been spread out over the available areas, and since those areas were a single shared resource, all applications had to reserve their memory one after the other, for every single allocation. If one of these applications then quit, its memory had to be freed again, leaving holes in the areas. Of course, the app_server needed to create quite a few areas - and with fragmentation like this, it wasted a lot of memory and address space, which is a real concern in the app_server.

Anyway, the new solution works quite differently: the app_server now tries to have a single area per application - if that application dies, its area can be freed instantly, without having to worry about other applications. To achieve this, the client reserves a certain address range for the app_server - that makes sure the area can be resized if required - while on the server's side, the area is always exactly as large as needed. Since the app_server doesn't reserve address space for the client, it has to manage fully relocatable memory: if an area cannot be resized in the app_server (because other areas are in its way), it can be relocated to another address where it fits. If that's not possible either, a new area is created, and the client is asked to clone it. Of course, every area is now only cloned once in the client, too.
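On the client side, this boils down to the standard area calls. Here is a minimal sketch, assuming the server hands the client an area ID and an offset for each bitmap (the function and variable names are made up for illustration; only clone_area() is the real kernel call):

    #include <OS.h>

    static area_id  sClonedArea = -1;   // cloned only once per application
    static uint8*   sAreaBase = NULL;

    uint8*
    bitmap_data_for(area_id serverArea, size_t offset)
    {
        if (sClonedArea < 0) {
            // clone the server's per-application area exactly once
            sClonedArea = clone_area("client bitmap heap",
                (void**)&sAreaBase, B_ANY_ADDRESS,
                B_READ_AREA | B_WRITE_AREA, serverArea);
            if (sClonedArea < 0)
                return NULL;
        }
        // every bitmap is just an offset into that single cloned area
        return sAreaBase + offset;
    }

The sketch leaves out the case where the server had to relocate or recreate its area - in that situation the client simply has to throw away its clone and clone the new area once.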

The structures that manage the allocations and free space in these areas are now kept separate from the memory itself, out of the client's reach, with the desired effect that the app_server can no longer be crashed that easily this way. The contention is also reduced to within a single application, which should be much more adequate.

As an additional bonus, the new solution should be much faster due to the vastly reduced number of area creations and clones. The allocator itself is pretty simple, though, and could probably be improved further - but it works pretty nicely so far.

Saturday, February 04, 2006

APM Support

A few days ago, we got a working APM driver in our kernel. APM stands for Advanced Power Management. It's a service provided by the computer's firmware, commonly called the BIOS in the x86 world. The latest APM standard, version 1.2, is already almost 10 years old. Today's computers still support it, even though the preferred way to get similar services (among others) is now ACPI, the Advanced Configuration and Power Interface. Thanks to Nathan Whitehorn's effort and Intel's example implementation, we even have the beginnings of ACPI support in Haiku as well.

But let's get back to APM. Theoretically, it can be used to put your system into one of several power states, like suspend or power off. You can also read out battery information from your laptop, such as the estimated remaining power. It even supports throttling the CPU on some laptops, but it only differentiates between full speed and slower speed.

The driver doesn't do much yet, but it should let you shut down your computer. In addition to that, it follows the standard and periodically polls for APM events. An example of an APM event is connecting the AC adapter to your laptop.

By default, the driver is currently disabled, but that might change once I have a better picture of the hardware it doesn't run on yet. I have successfully tested it on 4 different systems over here, but I also have one negative report.

If you're interested in testing Haiku's APM support yourself, you can add the line "apm true" to your kernel settings file. When you then enter "shutdown -q" in the Terminal, the system should be turned off. If an error comes back, APM couldn't be enabled for some reason. If nothing happens, your computer's APM implementation is probably not that good. In some rare cases, your computer may even refuse to boot with APM enabled - in that case, you can disable APM again in the safemode settings of the boot loader. If it really doesn't work, I would be very interested in the serial debug output, in case you can retrieve it.
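In short, the test looks like this (the location of the kernel settings file may differ depending on your installation; the path below is where it lives on mine):

    # add this line to your kernel settings file
    # (usually /boot/home/config/settings/kernel/drivers/kernel)
    apm true

    # then, in a Terminal:
    shutdown -q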

In other news, we now also have syslog support in the kernel, as well as on-screen debug output during boot. The former can be enabled with "syslog_debug_output true" in the kernel settings file, while the latter can be enabled in the safemode settings of the boot loader. "syslog" is a system logging service that currently stores its output in /var/log/syslog. Note that you must shut down the system gracefully to make sure the log gets written to disk.
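Again as a short recipe (same kernel settings file as above):

    # enable kernel syslog output in the kernel settings file
    syslog_debug_output true

    # after a graceful shutdown, have a look at:
    tail /var/log/syslog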

Sunday, January 15, 2006

Sorry, Volume Is Busy!

If you've used BeOS, you're probably familiar with the above message when trying to unmount a volume. From time to time, some application keeps accessing a volume, and you can't determine which application that is. It might be caused by a running live query, but it might also be caused by buggy background applications that forget to close a file.

I've just given you back control over your volumes in Haiku: you can now force unmounting such a volume -- applications still trying to access it will get an error back. Forcing an unmount requires an extra user interaction, though, so it's not the preferred solution.
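From an application's point of view, the whole thing boils down to a single flag. Here is a minimal sketch, assuming the fs_unmount_volume() call and the B_FORCE_UNMOUNT flag from fs_volume.h; the mount point is of course made up:

    #include <fs_volume.h>
    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        // a normal unmount would pass 0 instead of B_FORCE_UNMOUNT
        status_t status = fs_unmount_volume("/myVolume", B_FORCE_UNMOUNT);
        if (status != B_OK) {
            fprintf(stderr, "unmounting failed: %s\n", strerror(status));
            return 1;
        }
        return 0;
    }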

To remove one of the problems, live queries shouldn't keep a volume busy at all: it doesn't make any sense for them to prevent the normal unmounting process. This can hardly be in the interest of an application that is querying for something.

On the other hand, we should try to improve how a busy volume is presented to the user: instead of saying "sorry, busy", it should say something like "Sorry, the application Tracker is still accessing the volume." For the user, this makes an important difference - especially now that he has the power to force unmounting a volume, it gives him the information he needs to decide what he really wants to do.

As a side effect, we'd get a tool that can determine which applications have which files open - useful to report an application's misbehaviour back to its developers, or even better, to give the developer the possibility to monitor the performance of his application.

Well, at least you have the power now, control comes next :-)

Tuesday, January 03, 2006

And Thanks For All The Fish

My official employment at Haiku has ended now. I want to thank you for all the donations that made this possible. In retrospect, these were pretty busy months for Haiku; I think I have committed over 600 changes during that time - lots of minor ones, of course, but also a few bigger ones.

If Haiku runs on your system, you should now at least be able to experience uptimes of several hours, depending on what you do, of course :-)

Not that I want to claim the whole responsibility for these changes - there are still a lot of volunteers working on Haiku, now including myself again. If you want to give another developer the chance to work on Haiku full time, you know what you can do about it.

I'll continue this blog with my Haiku development insights, although a little less will be happening than in the last few weeks.

Thursday, December 29, 2005

Bug Hunting

Instead of completing the paging implementation, I got distracted by a couple of crashing bugs that showed up while testing Haiku, and they kept me busy for the last few days. Strangely enough, each bug was only easy to reproduce on a different system.

At least, I could now run Haiku with BitmapDrawing and Pulse in the background for over an hour (after which I shut it down myself). While playing around with it, I found some weird behaviour in the Backgrounds application, which I am currently working on. If those problems can be fixed quickly, I will have a look at enabling Deskbar add-ons and replicants under Haiku.

After this hiatus, I will continue to work on the virtual memory manager. I hope that I can finish the bigger part of the work this year, so that I can complete it easily after my official employment period ends (only 2 days are left!).

Monday, December 19, 2005

Back To The Kernel

It's not that the app_server is ready and polished or anything close - but it's in an acceptable state. For now, my main focus is back on the kernel, although I'll come back to the app_server from time to time in the next days and weeks.

I am currently looking into adding paging support to Haiku. That's the feature you may know by the terms "virtual memory" or "swapping". Plain and simple, it lets Haiku support more memory than you have installed in your computer: when the RAM is full, Haiku will use the hard disk as additional backing store.

But why would I start working on this before Haiku even runs stable? One reason is indeed to increase the system stability: currently, it's almost impossible to let the system run out of memory. Am I contradicting myself here? It looks like I do, granted, but let me try to explain what I mean. No, you don't have to know. If you like, you can just skip to the next paragraph. Yeah, you don't have to read that one either if you don't like :-))

Anyway, when an area of memory is allocated, the memory is not really taken from the system memory - it's just reserved. Only the memory you are really using is actually taken from the system page pool (where one page is an architecture dependent amount of memory, usually 4096 bytes); we call this "committing memory". Especially with binaries, the areas created for them are usually much larger than what finally makes its way into memory: functionality of your web browser that you never use, or its debug info, isn't loaded into memory, to save space and time.

So theoretically, we could always promise more memory than we can actually deliver. Just think about the main stack area in BeOS: it's a 16 MB area per application. You don't need that many fingers to figure out how many applications you could run if the system were entirely honest with you (well, at least a few years ago you would have been successful doing so). So yes, it's lying. If every application actually needed its whole stack, the system would have to stop them before memory became really tight. This technique is known as "overcommitting": the system pretends to have what you might need, because it assumes that you won't need it.

Therefore, it shouldn't lie to you that often; it should choose these occasions wisely. Haiku only overcommits for stacks. For everything else, it makes sure it can deliver the memory it has previously promised to you. This can result in "out of memory" situations even though there are plenty of free pages left - the problem with those pages is that they are promised to someone else. They may still get used for system caches and the like, but they are unavailable to applications.
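To make the committing part a bit more concrete, here is a minimal sketch using the normal BeOS/Haiku area API (the area name and size are of course just made up for illustration):

    #include <OS.h>
    #include <string.h>

    int
    main(void)
    {
        void* address = NULL;
        // Ask for 64 MB. With B_NO_LOCK, no physical pages are mapped yet,
        // but the kernel still commits the memory, i.e. it makes sure it
        // could back all of it later on.
        area_id area = create_area("lazy test area", &address,
            B_ANY_ADDRESS, 64 * 1024 * 1024, B_NO_LOCK,
            B_READ_AREA | B_WRITE_AREA);
        if (area < 0)
            return 1;

        // Only now, when the memory is actually touched, pages are taken
        // from the system page pool - one page per fault.
        memset(address, 0, 4096);

        delete_area(area);
        return 0;
    }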

And that's where the swap file comes in: with some extra space in there, the system can promise much more memory, and thus can actually use up those pages for real - it can really end up with no free pages left. In other words, to run out of memory (and to be able to test the kernel in these situations), the system at least needs to think it has a swap file.

The Haiku swap file implementation will be anything but spectacular, but it'll hopefully work well enough for our target audience - the usual desktop user doesn't have that heavy requirements there. On the other hand, it will probably work better than the one in BeOS - at least I can hardly imagine how it could run that badly :-)

Wednesday, December 14, 2005

MTRR

Sure, you too! Since Stephan made a BDirectWindow based version of our app_server that directly uses the hardware frame buffer and acceleration features, we noticed that it felt much faster there than on real hardware. How could that be?

The reason is actually very simple. Parts of our rendering pipeline, like text output, aren't optimized to use 32/64-bit memory access - that means they don't make full use of the memory bus. While we'd like to change this in the future, Intel introduced a feature called write-combining, in something like 1998, that is supposed to optimize write access to something like a frame buffer: instead of writing the bytes back to the buffer instantly, the CPU waits until you have written 32 sequential bytes, and then writes them out at once, in a single burst. Enabling write-combining is therefore a good idea even if you have already optimized your graphics output, although the effect is less noticeable in that case.
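The difference between the two kinds of access looks roughly like this - a made-up sketch, not code from our rendering pipeline:

    #include <SupportDefs.h>    // uint8/uint32
    #include <stddef.h>         // size_t

    // Byte-wise writes: on an uncached frame buffer, every single byte
    // becomes its own bus transaction.
    void
    fill_bytes(uint8* frameBuffer, size_t count, uint8 value)
    {
        for (size_t i = 0; i < count; i++)
            frameBuffer[i] = value;
    }

    // 32-bit writes: four times fewer transactions. Write-combining gets
    // the byte-wise version almost to the same point by merging the
    // writes into bursts before they hit the bus.
    void
    fill_words(uint32* frameBuffer, size_t count, uint32 value)
    {
        for (size_t i = 0; i < count; i++)
            frameBuffer[i] = value;
    }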

This brings us back to the MTRRs, "memory type range registers", just in case you asked what that might be :-) Using them, you can specify that the CPU should access a range of memory in a specific way - like write-combining, but there are other options, too. In BeOS and Haiku, they can only be specified via the map_physical_memory() call (through the B_MTR_* flags). Graphics drivers usually try to map their frame buffer in write-combining mode - at least all of ours do - so they benefit directly from the new functionality.
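In a graphics driver, that typically looks something like the following sketch; the physical address and size are placeholders that would really come from the card's PCI configuration, and newer kernels pass the physical address as its own integer type rather than a pointer:

    #include <KernelExport.h>

    static void* sFrameBuffer;

    area_id
    map_frame_buffer(void* physicalBase, size_t size)
    {
        // B_MTR_WC asks for a write-combining mapping of the frame buffer
        return map_physical_memory("frame buffer", physicalBase, size,
            B_ANY_KERNEL_BLOCK_ADDRESS | B_MTR_WC,
            B_READ_AREA | B_WRITE_AREA, &sFrameBuffer);
    }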

MTRRs are a CPU dependent feature that is programmed using model-specific registers (MSRs). Luckily, Intel and AMD use the exact same mechanism here, and thus we support it for processors of both vendors. We'll make sure that it is supported for other brands like VIA or Transmeta as well.
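For the curious, setting up one variable-range MTRR in the kernel amounts to roughly the following sketch (register numbers and bit layout as documented in the IA-32 manuals; x86_write_msr() stands in for whatever wrmsr wrapper the kernel uses, and the physical address width would really be read via CPUID):

    #include <SupportDefs.h>

    #define IA32_MTRR_PHYSBASE0        0x200
    #define IA32_MTRR_PHYSMASK0        0x201
    #define MTRR_TYPE_WRITE_COMBINING  0x01

    // base must be aligned to length, and length must be a power of two
    void
    set_write_combining(uint64 base, uint64 length, int physicalBits)
    {
        // mask selecting the range, limited to the CPU's physical address width
        uint64 mask = ~(length - 1) & ((1ULL << physicalBits) - 1);

        x86_write_msr(IA32_MTRR_PHYSBASE0, base | MTRR_TYPE_WRITE_COMBINING);
        // bit 11 marks the MTRR entry as valid
        x86_write_msr(IA32_MTRR_PHYSMASK0, mask | (1ULL << 11));
    }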

Even though the app_server still has lots of potential optimizations left, it already feels pretty good now. You can still manage to lock it up, but those problems should go away soon.