Question for Unix programmers...


Cumber
09-01-2005, 13:37:45
(Linux programmers can answer too, provided the fact that you guys have clone as well as fork doesn't make things too muddy)

How often do you use (or have you seen used) the fork system call where the point was to actually obtain a copy of the forking process, rather than just to make a new process that could do a bit of set up work and then use exec to throw away 99.99999% of the work done by fork?
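For anyone unsure which pattern is meant, here is a minimal sketch of the usual fork-then-exec idiom. The helper name `run_program` is made up for illustration; the calls themselves (`fork`, `execvp`, `waitpid`, `_exit`) are standard POSIX. The child is a full logical copy of the parent for only a few instructions before `execvp` replaces it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program named by argv[0] in a child
   process and return its exit status, or -1 on failure. */
int run_program(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();            /* child begins as a copy of the parent */
    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        /* Child: set-up work (redirecting or closing descriptors, etc.)
           would go here, before the copied image is thrown away. */
        execvp(argv[0], argv);     /* replaces the child's process image */
        _exit(127);                /* reached only if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector such as `{"sh", "-c", "exit 3", NULL}` should return 3, assuming `sh` is on the PATH.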

Sir Penguin
09-01-2005, 22:29:31
Only in servers. And viruses. We wrote a virus in my OS class last semester that brought down the server. Unfortunately, it was supposed to be a shell.

SP

Darkstar
10-01-2005, 08:30:16
Way to go, SP! :D

Deacon
11-01-2005, 02:19:00
I dunno. IIRC, fork used to immediately do all the work of copying. But at some point, somebody changed fork so that the copy is put off until there's actually a need to do so. If you're only reading, then you can look at the parent's memory. If you're writing, then the copy is done and the write goes to the child's memory.

Cumber
11-01-2005, 14:36:29
Oh yes. Copy-on-write. Unix marks all of the parent process's writable pages (page = fixed-size memory block) as read-only, gives the child process a page table that points to all the same pages, and lets them both go. But it secretly remembers which pages were supposed to be read/write. If either process ever writes to one of them, it causes a protection fault. Unix checks whether it's a real protection fault or a copy-on-write fault. If the latter, it copies the page, points the two processes' page table entries for that page at the separate copies, marks them writable again, and resumes the faulting process at the instruction that faulted. Extending this to more than one child process (and children of child processes) can get complicated, but that's basically it.
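A small sketch of what this looks like from user space, with a made-up function name (`parent_byte_after_child_write`). Note the limits of what it shows: a program can only observe that the child's write does not reach the parent. Whether the kernel copied the page lazily at the write or eagerly at the fork is invisible here, so this illustrates the semantics that copy-on-write has to preserve, not the mechanism itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the first byte of the parent's buffer after
   the child has overwritten its own view of it, or -1 on error. */
int parent_byte_after_child_write(void)
{
    size_t len = (size_t)1 << 20;      /* 1 MiB, spanning many pages */
    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 'P', len);             /* touch every page in the parent */

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }

    if (pid == 0) {
        /* Child: on a copy-on-write kernel this single store is what
           would trigger the fault and the private copy of one page. */
        buf[0] = 'C';
        _exit(buf[0] == 'C' ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    int seen = buf[0];                 /* parent's page is unchanged */
    free(buf);
    return seen;
}
```

The function should return 'P': the child saw its own 'C', and the parent's copy of the page was never modified.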

Orders of magnitude better than hard forking. However, if the child process only does a teeny tiny bit of work and then execs, you're still left with the parent having all its pages marked read-only, which leaves the OS with two options: it can either scan through the parent's entire page table (again), setting the proper pages back to read/write, or it can let the parent process fault on every page it needs to write. To my mind, this still sucks a bit. It would be nice if the cost of getting a process up and running depended pretty much only on the size of the process being created, rather than on the size of the parent process. I was curious whether a method for process creation that avoided most of this usually-unnecessary work, and could handle every case except a fork where a real copy is actually wanted, would cause many problems. The next question is of course how much would really be gained by improving the efficiency of process creation (not that much, I believe, depending on the application at hand, but it's partly the principle of the thing :)).
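One existing interface in roughly this spirit is POSIX `posix_spawn`, which creates a process running a new program in one call. Below is a minimal sketch with a made-up wrapper name (`spawn_program`). Big assumption to flag: whether this actually avoids the page-table work depends on the implementation, since a library is free to build `posix_spawn` on top of `fork` or `vfork`.

```c
#include <assert.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;   /* the caller's environment, passed to the child */

/* Hypothetical wrapper: start the program named by argv[0] without an
   explicit fork, wait for it, and return its exit status or -1. */
int spawn_program(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid;
    /* NULL file actions and attributes: the child keeps the default
       descriptor and signal set-up inherited from the caller. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0)
        return -1;       /* posix_spawnp returns an error number on failure */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The trade-off is the one raised above: any set-up work between creation and exec has to be expressed through the file-actions and attributes arguments rather than as ordinary code running in the child.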

Deacon
12-01-2005, 02:01:45
I bet Torvalds knows. :)

elfwine
31-01-2005, 12:57:15
Greetings. I have only recently found this discussion board and I look forward to being an active part of this community.

Now to business:

Well, I've been looking around but I still need some help here.

Can anyone explain to me how copy-on-write works for processes, and when it happens? And what advantages does it have?

Thanks in advance ;)