commit 5bd36fdff16871eb7d06fc26cac07e7f2703432b
Author: Thomas Schwinge <tschwinge@gnu.org>
Date:   2012-11-29 01:33:22 +0100

    IRC.

 open_issues/libpthread.mdwn | 668 +
 1 file changed, 668 insertions, 0 deletions
There is a [[!FF_project 275]][[!tag bounty]] on this task.

    <braunr> ouch
    <bddebian> braunr: Do you have debugging enabled in that custom kernel you installed? Apparently it is sitting at the debug prompt.


## IRC, freenode, #hurd, 2012-08-12

    <braunr> hmm, it seems the hurd notion of cancellation is actually not the pthread one at all
    <braunr> pthread_cancel merely marks a thread as being cancelled, while hurd_thread_cancel interrupts it
    <braunr> ok, i have a pthread_hurd_cond_wait_np function in glibc


## IRC, freenode, #hurd, 2012-08-13

    <braunr> nice, i got ext2fs work with pthreads
    <braunr> there are issues with the stack size strongly limiting the number of concurrent threads, but that's easy to fix
    <braunr> one problem with the hurd side is the condition implications
    <braunr> i think it should be deal separately, and before doing anything with pthreads
    <braunr> but that's minor, the most complex part is, again, the term server
    <braunr> other than that, it was pretty easy to do
    <braunr> but, i shouldn't speak too soon, who knows what tricky bootstrap issue i'm gonna face ;p
    <braunr> tschwinge: i'd like to know how i should proceed if i want a symbol in a library overriden by that of a main executable
    <braunr> e.g. have libpthread define a default stack size, and let executables define their own if they want to change it
    <braunr> tschwinge: i suppose i should create a weak alias in the library and a normal variable in the executable, right ?
    <braunr> hm i'm making this too complicated
    <braunr> don't mind that stupid question
    <tschwinge> braunr: A simple variable definition would do, too, I think?
    <tschwinge> braunr: Anyway, I'd first like to know why we can'T reduce the size of libpthread threads from 2 MiB to 64 KiB as libthreads had. Is that a requirement of the pthread specification?
    <braunr> tschwinge: it's a requirement yes
    <braunr> the main reason i see is that hurd threadvars (which are still present) rely on common stack sizes and alignment to work
    <tschwinge> Mhm, I see.
    <braunr> so for now, i'm using this approach as a hack only
    <tschwinge> I'm working on phasing out threadvars, but we're not there yet.
    <tschwinge> Yes, that's fine for the moment.
    <braunr> tschwinge: a simple definition wouldn't work
    <braunr> tschwinge: i resorted to a weak symbol, and see how it goes
    <braunr> tschwinge: i supposed i need to export my symbol as a global one, otherwise making it weak makes no sense, right ?
    <braunr> suppose*
    <braunr> tschwinge: also, i'm not actually sure what you meant is a requirement about the stack size, i shouldn't have answered right away
    <braunr> no there is actually no requirement
    <braunr> i misunderstood your question
    <braunr> hm when adding this weak variable, starting a program segfaults :(
    <braunr> apparently on ___pthread_self, a tls variable
    <braunr> fighting black magic begins
    <braunr> arg, i can't manage to use that weak symbol to reduce stack sizes :(
    <braunr> ah yes, finally
    <braunr> git clone /path/to/glibc.git on a pthread-powered ext2fs server :>
    <braunr> tschwinge: seems i have problems using __thread in hurd code
    <braunr> tschwinge: they produce undefined symbols
    <braunr> tschwinge: forget that, another mistake on my part
    <braunr> so, current state: i just need to create another patch, for the code that is included in the debian hurd package but not in the upstream hurd repository (e.g. procfs, netdde), and i should be able to create hurd packages taht completely use pthreads


## IRC, freenode, #hurd, 2012-08-14

    <braunr> tschwinge: i have weird bootstrap issues, as expected
    <braunr> tschwinge: can you point me to important files involved during bootstrap ?
    <braunr> my ext2fs.static server refuses to start as a rootfs, whereas it seems to work fine otherwise
    <braunr> hm, it looks like it's related to global signal dispositions


## IRC, freenode, #hurd, 2012-08-15

    <braunr> ahah, a subhurd running pthreads-powered hurd servers only
    <LarstiQ> braunr: \o/
    <braunr> i can even long on ssh
    <braunr> log
    <braunr> pinotree: for reference, i uploaded my debian-specific changes there :
    <braunr> http://git.sceen.net/rbraun/debian_hurd.git/
    <braunr> darnassus is now running a pthreads-enabled hurd system :)


## IRC, freenode, #hurd, 2012-08-16

    <braunr> my pthreads-enabled hurd systems can quickly die under load
    <braunr> youpi: with hurd servers using pthreads, i occasionally see thread storms apparently due to a deadlock
    <braunr> youpi: it makes me think of the problem you sometimes have (and had often with the page cache patch)
    <braunr> in cthreads, mutex and condition operations are macros, and they check the mutex/condition queue without holding the internal mutex/condition lock
    <braunr> i'm not sure where this can lead to, but it doesn't seem right
    <pinotree> isn't that a bit dangerous?
    <braunr> i believe it is
    <braunr> i mean
    <braunr> it looks dangerous
    <braunr> but it may be perfectly safe
    <pinotree> could it be?
    <braunr> aiui, it's an optimization, e.g. "dont take the internal lock if there are no thread to wake"
    <braunr> but if there is a thread enqueuing itself at the same time, it might not be waken
    <pinotree> yeah
    <braunr> pthreads don't have this issue
    <braunr> and what i see looks like a deadlock
    <pinotree> anything can happen between the unlocked checking and the following instruction
    <braunr> so i'm not sure how a situation working around a faulty implementation would result in a deadlock with a correct one
    <braunr> on the other hand, the error youpi reported (http://lists.gnu.org/archive/html/bug-hurd/2012-07/msg00051.html) seems to indicate something is deeply wrong with libports
    <pinotree> it could also be the current code does not really "works around" that, but simply implicitly relies on the so-generated behaviour
    <braunr> luckily not often
    <braunr> maybe
    <braunr> i think we have to find and fix these issues before moving to pthreads entirely
    <braunr> (ofc, using pthreads to trigger those bugs is a good procedure)
    <pinotree> indeed
    <braunr> i wonder if tweaking the error checking mode of pthreads to abort on EDEADLK is a good approach to detecting this problem
    <braunr> let's try !
    <braunr> youpi: eh, i think i've spotted the libports ref mistake
    <youpi> ooo!
    <youpi> .oOo.!!
    <gnu_srs> Same problem but different patches
    <braunr> look at libports/bucket-iterate.c
    <braunr> in the HURD_IHASH_ITERATE loop, pi->refcnt is incremented without a lock
    <youpi> Mmm, the incrementation itself would probably be compiled into an INC, which is safe in UP
    <youpi> it's an add currently actually
    <youpi> 0x00004343 <+163>: addl $0x1,0x4(%edi)
    <braunr> 40c4: 83 47 04 01 addl $0x1,0x4(%edi)
    <youpi> that makes it SMP unsafe, but not UP unsafe
    <braunr> right
    <braunr> too bad
    <youpi> that still deserves fixing :)
    <braunr> the good side is my mind is already wired for smp
    <youpi> well, it's actually not UP either
    <youpi> in general
    <youpi> when the processor is not able to do the add in one instruction
    <braunr> sure
    <braunr> youpi: looks like i'm wrong, refcnt is protected by the global libports lock
    <youpi> braunr: but aren't there pieces of code which manipulate the refcnt while taking another lock than the global libports lock
    <youpi> it'd not be scalable to use the global libports lock to protect refcnt
    <braunr> youpi: imo, the scalability issues are present because global locks are taken all the time, indeed
    <youpi> urgl
    <braunr> yes ..
    <braunr> when enabling mutex checks in libpthread, pfinet dies :/
    <braunr> grmbl, when trying to start "ls" using my deadlock-detection libpthread, the terminal gets unresponsive, and i can't even use ps .. :(
    <pinotree> braunr: one could say your deadlock detection works too good... :P
    <braunr> pinotree: no, i made a mistake :p
    <braunr> it works now :)
    <braunr> well, works is a bit fast
    <braunr> i can't attach gdb now :(
    <braunr> *sigh*
    <braunr> i guess i'd better revert to a cthreads hurd and debug from there
    <braunr> eh, with my deadlock-detection changes, recursive mutexes are now failing on _pthread_self(), which for some obscure reason generates this
    <braunr> => 0x0107223b <+283>: jmp 0x107223b <__pthread_mutex_timedlock_internal+283>
    <braunr> *sigh*


## IRC, freenode, #hurd, 2012-08-17

    <braunr> aw, the thread storm i see isn't a deadlock
    <braunr> seems to be mere contention ....
    <braunr> youpi: what do you think of the way ports_manage_port_operations_multithread determines it needs to spawn a new thread ?
    <braunr> it grabs a lock protecting the number of threads to determine if it needs a new thread
    <braunr> then releases it, to retake it right after if a new thread must be created
    <braunr> aiui, it could lead to a situation where many threads could determine they need to create threads
    <youpi> braunr: there's no reason to release the spinlock before re-taking it
    <youpi> that can indeed lead to too much thread creations
    <braunr> youpi: a harder question
    <braunr> youpi: what if thread creation fails ? :/
    <braunr> if i'm right, hurd servers simply never expect thread creation to fail
    <youpi> indeed
    <braunr> and as some patterns have threads blocking until another produce an event
    <braunr> i'm not sure there is any point handling the failure at all :/
    <youpi> well, at least produce some output
    <braunr> i added a perror
    <youpi> so we know that happened
    <braunr> async messaging is quite evil actually
    <braunr> the bug i sometimes have with pfinet is usually triggered by fakeroot
    <braunr> it seems to use select a lot
    <braunr> and select often destroys ports when it has something to return to the caller
    <braunr> which creates dead name notifications
    <braunr> and if done often enough, a lot of them
    <youpi> uh
    <braunr> and as pfinet is creating threads to service new messages, already existing threads are starved and can't continue
    <braunr> which leads to pfinet exhausting its address space with thread stacks (at about 30k threads)
    <braunr> i initially thought it was a deadlock, but my modified libpthread didn't detect one, and indeed, after i killed fakeroot (the whole dpkg-buildpackage process hierarchy), pfinet just "cooled down"
    <braunr> with almost all 30k threads simply waiting for requests to service, and the few expected select calls blocking (a few ssh sessions, exim probably, possibly others)
    <braunr> i wonder why this doesn't happen with cthreads
    <youpi> there's a 4k guard between stacks, otherwise I don't see anything obvious
    <braunr> i'll test my pthreads package with the fixed ports_manage_port_operations_multithread
    <braunr> but even if this "fix" should reduce thread creation, it doesn't prevent the starvation i observed
    <braunr> evil concurrency :p

    <braunr> youpi: hm i've just spotted an important difference actually
    <braunr> youpi: glibc sched_yield is __swtch(), cthreads is thread_switch(MACH_PORT_NULL, SWITCH_OPTION_DEPRESS, 10)
    <braunr> i'll change the glibc implementation, see how it affects the whole system

    <braunr> youpi: do you think bootsting the priority or cancellation requests is an acceptable workaround ?
    <braunr> boosting
    <braunr> of*
    <youpi> workaround for what?
    <braunr> youpi: the starvation i described earlier
    <youpi> well, I guess I'm not into the thing enough to understand
    <youpi> you meant the dead port notifications, right?
    <braunr> yes
    <braunr> they are the cancellation triggers
    <youpi> cancelling whaT?
    <braunr> a blocking select for example
    <braunr> ports_do_mach_notify_dead_name -> ports_dead_name -> ports_interrupt_notified_rpcs -> hurd_thread_cancel
    <braunr> so it's important they are processed quickly, to allow blocking threads to unblock, reply, and be recycled
    <youpi> you mean the threads in pfinet?
    <braunr> the issue applies to all servers, but yes
    <youpi> k
    <youpi> well, it can not not be useful :)
    <braunr> whatever the choice, it seems to be there will be a security issue (a denial of service of some kind)
    <youpi> well, it's not only in that case
    <youpi> you can always queue a lot of requests to a server
    <braunr> sure, i'm just focusing on this particular problem
    <braunr> hm
    <braunr> max POLICY_TIMESHARE or min POLICY_FIXEDPRI ?
    <braunr> i'd say POLICY_TIMESHARE just in case
    <braunr> (and i'm not sure mach handles fixed priority threads first actually :/)
    <braunr> hm my current hack which consists of calling swtch_pri(0) from a freshly created thread seems to do the job eh
    <braunr> (it may be what cthreads unintentionally does by acquiring a spin lock from the entry function)
    <braunr> not a single issue any more with this hack
    <bddebian> Nice
    <braunr> bddebian: well it's a hack :p
    <braunr> and the problem is that, in order to boost a thread's priority, one would need to implement that in libpthread
    <bddebian> there isn't thread priority in libpthread?
    <braunr> it's not implemented
    <bddebian> Interesting
    <braunr> if you want to do it, be my guest :p
    <braunr> mach should provide the basic stuff for a partial implementation
    <braunr> but for now, i'll fall back on the hack, because that's what cthreads "does", and it's "reliable enough"

    <antrik> braunr: I don't think the locking approach in ports_manage_port_operations_multithread() could cause issues. the worst that can happen is that some other thread becomes idle between the check and creating a new thread -- and I can't think of a situation where this could have any impact...
    <braunr> antrik: hm ?
    <braunr> the worst case is that many threads will evalute spawn to 1 and create threads, whereas only one of them should have
    <antrik> braunr: I'm not sure perror() is a good way to handle the situation where thread creation failed. this would usually happen because of resource shortage, right? in that case, it should work in non-debug builds too
    <braunr> perror isn't specific to debug builds
    <braunr> i'm building glibc packages with a pthreads-enabled hurd :>
    <braunr> (which at one point run the test allocating and filling 2 GiB of memory, which passed)
    <braunr> (with a kernel using a 3/1 split of course, swap usage reached something like 1.6 GiB)
    <antrik> braunr: BTW, I think the observation that thread storms tend to happen on destroying stuff more than on creating stuff has been made before...
    <braunr> ok
    <antrik> braunr: you are right about perror() of course. brain fart -- was thinking about assert_perror()
    <antrik> (which is misused in some places in existing Hurd code...)
    <antrik> braunr: I still don't see the issue with the "spawn" locking... the only situation where this code can be executed concurrently is when multiple threads are idle and handling incoming request -- but in that case spawning does *not* happen anyways...
    <antrik> unless you are talking about something else than what I'm thinking of...
    <braunr> well imagine you have idle threads, yes
    <braunr> let's say a lot like a thousand
    <braunr> and the server gets a thousand requests
    <braunr> a one more :p
    <braunr> normally only one thread should be created to handle it
    <braunr> but here, the worst case is that all threads run internal_demuxer roughly at the same time
    <braunr> and they all determine they need to spawn a thread
    <braunr> leading to another thousand
    <braunr> (that's extreme and very unlikely in practice of course)
    <antrik> oh, I see... you mean all the idle threads decide that no spawning is necessary; but before they proceed, finally one comes in and decides that it needs to spawn; and when the other ones are scheduled again they all spawn unnecessarily?
    <braunr> no, spawn is a local variable
    <braunr> it's rather, all idle threads become busy, and right before servicing their request, they all decide they must spawn a thread
    <antrik> I don't think that's how it works. changing the status to busy (by decrementing the idle counter) and checking that there are no idle threads is atomic, isn't it?
    <braunr> no
    <antrik> oh
    <antrik> I guess I should actually look at that code (again) before commenting ;-)
    <braunr> let me check
    <braunr> no sorry you're right
    <braunr> so right, you can't lead to that situation
    <braunr> i don't even understand how i can't see that :/
    <braunr> let's say it's the heat :p
    <braunr> 22:08 < braunr> so right, you can't lead to that situation
    <braunr> it can't lead to that situation


## IRC, freenode, #hurd, 2012-08-18

    <braunr> one more attempt at fixing netdde, hope i get it right this time
    <braunr> some parts assume a ddekit thread is a cthread, because they share the same address
    <braunr> it's not as easy when using pthread_self :/
    <braunr> good, i got netdde work with pthreads
    <braunr> youpi: for reference, there are now glibc, hurd and netdde packages on my repository
    <braunr> youpi: the debian specific patches can be found at my git repository (http://git.sceen.net/rbraun/debian_hurd.git/ and http://git.sceen.net/rbraun/debian_netdde.git/)
    <braunr> except a freeze during boot (between exec and init) which happens rarely, and the starvation which still exists to some extent (fakeroot can cause many threads to be created in pfinet and pflocal), the glibc/hurd packages have been working fine for a few days now
    <braunr> the threading issue in pfinet/pflocal is directly related to select, which the io_select_timeout patches should fix once merged
    <braunr> well, considerably reduce at least
    <braunr> and maybe fix completely, i'm not sure


## IRC, freenode, #hurd, 2012-08-27

    <pinotree> braunr: wrt a78a95d in your pthread branch of hurd.git, shouldn't that job theorically been done using pthread api (of course after implementing it)?
    <braunr> pinotree: sure, it could be done through pthreads
    <braunr> pinotree: i simply restricted myself to moving the hurd to pthreads, not augment libpthread
    <braunr> (you need to remember that i work on hurd with pthreads because it became a dependency of my work on fixing select :p)
    <braunr> and even if it wasn't the reason, it is best to do these tasks (replace cthreads and implement pthread scheduling api) separately
    <pinotree> braunr: hm ok
    <pinotree> implementing the pthread priority bits could be done independently though

    <braunr> youpi: there are more than 9000 threads for /hurd/streamio kmsg on ironforge oO
    <youpi> kmsg ?!
    <youpi> it's only /dev/klog right?
    <braunr> not sure but it seems so
    <pinotree> which syslog daemon is running?
    <youpi> inetutils
    <youpi> I've restarted the klog translator, to see whether when it grows again

    <braunr> 6 hours and 21 minutes to build glibc on darnassus
    <braunr> pfinet still runs only 24 threads
    <braunr> the ext2 instance used for the build runs 2k threads, but that's because of the pageouts
    <braunr> so indeed, the priority patch helps a lot
    <braunr> (pfinet used to have several hundreds, sometimes more than a thousand threads after a glibc build, and potentially increasing with each use of fakeroot)
    <braunr> exec weights 164M eww, we definitely have to fix that leak
    <braunr> the leaks are probably due to wrong mmap/munmap usage

[[exec_leak]].
### IRC, freenode, #hurd, 2012-08-29

    <braunr> youpi: btw, after my glibc build, there were as little as between 20 and 30 threads for pflocal and pfinet
    <braunr> with the priority patch
    <braunr> ext2fs still had around 2k because of pageouts, but that's expected
    <youpi> ok
    <braunr> overall the results seem very good and allow the switch to pthreads
    <youpi> yep, so it seems
    <braunr> youpi: i think my first integration branch will include only a few changes, such as this priority tuning, and the replacement of condition_implies
    <youpi> sure
    <braunr> so we can push the move to pthreads after all its small dependencies
    <youpi> yep, that's the most readable way


## IRC, freenode, #hurd, 2012-09-03

    <gnu_srs> braunr: Compiling yodl-3.00.0-7:
    <gnu_srs> pthreads: real 13m42.460s, user 0m0.000s, sys 0m0.030s
    <gnu_srs> cthreads: real 9m 6.950s, user 0m0.000s, sys 0m0.020s
    <braunr> thanks
    <braunr> i'm not exactly certain about what causes the problem though
    <braunr> it could be due to libpthread using doubly-linked lists, but i don't think the overhead would be so heavier because of that alone
    <braunr> there is so much contention sometimes that it could
    <braunr> the hurd would have been better off with single threaded servers :/
    <braunr> we should probably replace spin locks with mutexes everywhere
    <braunr> on the other hand, i don't have any more starvation problem with the current code


### IRC, freenode, #hurd, 2012-09-06

    <gnu_srs> braunr: Yes you are right, the new pthread-based Hurd is _much_ slower.
    <gnu_srs> One annoying example is when compiling, the standard output is written in bursts with _long_ periods of no output in between:-(
    <braunr> that's more probably because of the priority boost, not the overhead
    <braunr> that's one of the big issues with our mach-based model
    <braunr> we either give high priorities to our servers, or we can suffer from message floods
    <braunr> that's in fact more a hurd problem than a mach one
    <gnu_srs> braunr: any immediate ideas how to speed up responsiveness the pthread-hurd. It is annoyingly slow (slow-witted)
    <braunr> gnu_srs: i already answered that
    <braunr> it doesn't look that slower on my machines though
    <gnu_srs> you said you had some ideas, not which. except for mcsims work.
    <braunr> i have ideas about what makes it slower
    <braunr> it doesn't mean i have solutions for that
    <braunr> if i had, don't you think i'd have applied them ? :)
    <gnu_srs> ok, how to make it more responsive on the console? and printing stdout more regularly, now several pages are stored and then flushed.
    <braunr> give more details please
    <gnu_srs> it behaves like a loaded linux desktop, with little memory left...
    <braunr> details about what you're doing
    <gnu_srs> apt-get source any big package and: fakeroot debian/rules binary 2>&1 | tee ../binary.logg
    <braunr> isee
    <braunr> well no, we can't improve responsiveness
    <braunr> without reintroducing the starvation problem
    <braunr> they are linked
    <braunr> and what you're doing involes a few buffers, so the laggy feel is expected
    <braunr> if we can fix that simply, we'll do so after it is merged upstream


### IRC, freenode, #hurd, 2012-09-07

    <braunr> gnu_srs: i really don't feel the sluggishness you described with hurd+pthreads on my machines
    <braunr> gnu_srs: what's your hardware ?
    <braunr> and your VM configuration ?
    <gnu_srs> Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
    <gnu_srs> kvm -m 1024 -net nic,model=rtl8139 -net user,hostfwd=tcp::5562-:22 -drive cache=writeback,index=0,media=disk,file=hurd-experimental.img -vnc :6 -cdrom isos/netinst_2012-07-15.iso -no-kvm-irqchip
    <braunr> what is the file system type where your disk image is stored ?
    <gnu_srs> ext3
    <braunr> and how much physical memory on the host ?
    <braunr> (paste meminfo somewhere please)
    <gnu_srs> 4G, and it's on the limit, 2 kvm instances+gnome,etc
    <gnu_srs> 80% in use by programs, 14% in cache.
    <braunr> ok, that's probably the reason then
    <braunr> the writeback option doesn't help a lot if you don't have much cache
    <gnu_srs> well the other instance is cthreads based, and not so sluggish.
    <braunr> we know hurd+pthreads is slower
    <braunr> i just wondered why i didn't feel it that much
    <gnu_srs> try to fire up more kvm instances, and do a heavy compile...
    <braunr> i don't do that :)
    <braunr> that's why i never had the problem
    <braunr> most of the time i have like 2-3 GiB of cache
    <braunr> and of course more on shattrath
    <braunr> (the host of the sceen.net hurdboxes, which has 16 GiB of ram)


### IRC, freenode, #hurd, 2012-09-11

    <gnu_srs> Monitoring the cthreads and the pthreads load under Linux shows:
    <gnu_srs> cthread version: load can jump very high, less cpu usage than pthread version
    <gnu_srs> pthread version: less memory usage, background cpu usage higher than for cthread version
    <braunr> that's the expected behaviour
    <braunr> gnu_srs: are you using the lifothreads gnumach kernel ?
    <gnu_srs> for experimental, yes.
    <gnu_srs> i.e. pthreads
    <braunr> i mean, you're measuring on it right now, right ?
    <gnu_srs> yes, one instance running cthreads, and one pthreads (with lifo gnumach)
    <braunr> ok
    <gnu_srs> no swap used in either instance, will try a heavy compile later on.
    <braunr> what for ?
    <gnu_srs> E.g. for memory when linking. I have swap available, but no swap is used currently.
    <braunr> yes but, what do you intend to measure ?
    <gnu_srs> don't know, just to see if swap is used at all. it seems to be used not very much.
    <braunr> depends
    <braunr> be warned that using the swap means there is pageout, which is one of the triggers for global system freeze :p
    <braunr> anonymous memory pageout
    <gnu_srs> for linux swap is used constructively, why not on hurd?
    <braunr> because of hard to squash bugs
    <gnu_srs> aha, so it is bugs hindering swap usage:-/
    <braunr> yup :/
    <gnu_srs> Let's find them thenO:-), piece of cake
    <braunr> remember my page cache branch in gnumach ? :)

[[gnumach_page_cache_policy]].

    <gnu_srs> not much
    <braunr> i started it before fixing non blocking select
    <braunr> anyway, as a side effect, it should solve this stability issue too, but it'll probably take time
    <gnu_srs> is that branch integrated? I only remember slab and the lifo stuff.
    <gnu_srs> and mcsims work
    <braunr> no it's not
    <braunr> it's unfinished
    <gnu_srs> k!
    <braunr> it correctly extends the page cache to all available physical memory, but since the hurd doesn't scale well, it slows the system down


## IRC, freenode, #hurd, 2012-09-14

    <braunr> arg
    <braunr> darnassus seems to eat 100% cpu and make top freeze after some time
    <braunr> seems like there is an important leak in the pthreads version
    <braunr> could be the lifothreads patch :/
    <cjbirk> there's a memory leak?
    <cjbirk> in pthreads?
    <braunr> i don't think so, and it's not a memory leak
    <braunr> it's a port leak
    <braunr> probably in the kernel


### IRC, freenode, #hurd, 2012-09-17

    <braunr> nice, the port leak is actually caused by the exim4 loop bug


### IRC, freenode, #hurd, 2012-09-23

    <braunr> the port leak i observed a few days ago is because of exim4 (the infamous loop eating the cpu we've been seeing regularly)

[[fork_deadlock]]?

    <youpi> oh
    <braunr> next time it happens, and if i have the occasion, i'll examine the problem
    <braunr> tip: when you can't use top or ps -e, you can use ps -e -o pid=,args=
    <youpi> or -M ?
    <braunr> haven't tested


## IRC, freenode, #hurd, 2012-09-23

    <braunr> tschwinge: i committed the last hurd pthread change, http://git.savannah.gnu.org/cgit/hurd/hurd.git/log/?h=master-pthreads
    <braunr> tschwinge: please tell me if you consider it ok for merging


### IRC, freenode, #hurd, 2012-11-27

    <youpi> braunr: btw, I forgot to forward here, with the glibc patch it does boot fine, I'll push all that and build some almost-official packages for people to try out what will come when eglibc gets the change in unstable
    <braunr> youpi: great :)
    <youpi> thanks for managing the final bits of this
    <youpi> (and thanks for everybody involved)
    <braunr> sorry again for the non obvious parts
    <braunr> if you need the debian specific parts refined (e.g. nice commits for procfs & others), i can do that
    <youpi> I'll do that, no pb
    <braunr> ok
    <braunr> after that (well, during also), we should focus more on bug hunting


## IRC, freenode, #hurd, 2012-10-26

    <mcsim1> hello. What does following error message means? "unable to adjust libports thread priority: Operation not permitted" It appears when I set translators.
    <mcsim1> Seems has some attitude to libpthread. Also following appeared when I tried to remove translator: "pthread_create: Resource temporarily unavailable"
    <mcsim1> Oh, first message appears very often, when I use translator I set.
    <braunr> mcsim1: it's related to a recent patch i sent
    <braunr> mcsim1: hurd servers attempt to increase their priority on startup (when a thread is created actually)
    <braunr> to reduce message floods and thread storms (such sweet names :))
    <braunr> but if you start them as an unprivileged user, it fails, which is ok, it's just a warning
    <braunr> the second way is weird
    <braunr> it normally happens when you're out of available virtual space, not when shutting a translator donw
    <mcsim1> braunr: you mean this patch: libports: reduce thread starvation on message floods?
    <braunr> yes
    <braunr> remember you're running on darnassus
    <braunr> with a heavily modified hurd/glibc
    <braunr> you can go back to the cthreads version if you wish
    <mcsim1> it's better to check translators privileges, before attempting to increase their priority, I think.
    <braunr> no
    <mcsim1> it's just a bit annoying
    <braunr> privileges can be changed during execution
    <braunr> well remove it
    <mcsim1> But warning should not appear.
    <braunr> what could be done is to limit the warning to one occurrence
    <braunr> mcsim1: i prefer that it appears
    <mcsim1> ok
    <braunr> it's always better to be explicit and verbose
    <braunr> well not always, but very often
    <braunr> one of the reasons the hurd is so difficult to debug is the lack of a "message server" à la dmesg

[[translator_stdout_stderr]].