[[!meta copyright="Copyright © 2010, 2011, 2012 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable id="license" text="Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled [[GNU Free Documentation License|/fdl]]."]]"""]]

[[!tag open_issue_glibc]]

There are a lot of reports about this issue, but no thorough analysis.

# Short Timeouts

## `elinks`

IRC, unknown channel, unknown date:

    <paakku> This is related to ELinks... I've looked at the select() implementation for the Hurd in glibc and it seems that giving it a short timeout could cause it not to report that file descriptors are ready.
    <paakku> It sends a request to the Mach port of each file descriptor and then waits for responses from the servers.
    <paakku> Even if the file descriptors have data for reading or are ready for writing, the server processes might not respond immediately.
    <paakku> So if I want ELinks to check which file descriptors are ready, how long should the timeout be in order to ensure that all servers can respond in time?
    <paakku> Or do I just imagine this problem?

## [[dbus]]

## IRC

### IRC, freenode, #hurd, 2012-01-31

    <braunr> don't you find vim extremely slow lately ?
    <braunr> (and not because of cpu usage but rather unnecessary sleeps)
    <jkoenig> yes.
    <braunr> wasn't there a discussion to add a minimum timeout to mach_msg for select() or something like that during the past months ?
    <youpi> there was, and it was added
    <youpi> that could be it
    <youpi> I don't want to drop it though, some app really need it
    <braunr> as a debian patch only iirc ?
    <youpi> yes
    <braunr> ok
    <braunr> if i'm right, the proper solution was to fix remote servers instead of client calls
    <youpi> (no drop, unless the actual bug gets fixed of course)
    <braunr> so i'm guessing it's just a hack in between
    <youpi> not only
    <youpi> with a timeout of zero, mach will just give *no* time for the servers to give an answer
    <braunr> that's because the timeout is part of the client call
    <youpi> so the protocol has to be rethought, both server/client side
    <braunr> a suggested solution was to make it a parameter
    <braunr> i mean, part of the message
    <braunr> not a mach_msg parameter
    <jkoenig> OTOH the servers should probably not be trusted to enforce the timeout.
    <braunr> why ?
    <jkoenig> they're not necessarily trusted. (but then again, that's not the only circumstances where that's a problem)
    <braunr> there is a proposed solution for that too (trust root and self servers only by default)
    <jkoenig> I'm not sure they're particularily easy to identify in the general case
    <braunr> "they" ? the solutions you mean ?
    <braunr> or the servers ?
    <youpi> jkoenig: you can't trust the servers in general to provide an answer, timeout or not
    <jkoenig> yes the root/self servers.
    <braunr> ah
    <youpi> jkoenig: you can stat the actual node before dereferencing the translator
    <jkoenig> could they not report FD activity asynchronously to the message port? libc would cache the state
    <youpi> I don't understand what you mean
    <youpi> anyway, really making the timeout part of the message is not a problem
    <braunr> 10:10 < youpi> jkoenig: you can't trust the servers in general to provide an answer, timeout or not
    <youpi> we already trust everything (e.g.
      read() ) into providing an answer immediately
    <braunr> i don't see why
    <youpi> braunr: put sleep(1) in S_io_read()
    <youpi> it'll not give you an immediate answer, O_NODELAY being set or not
    <braunr> well sleep is evil, but let's just say the server thread blocks
    <braunr> ok
    <braunr> well fix the server
    <youpi> so we agree
    <braunr> ?
    <youpi> in the current security model, we trust the server into achieve the timeout
    <braunr> yes
    <youpi> and jkoenig's remark is more global than just select()
    <braunr> taht's why we must make sure we're contacting trusted servers by default
    <youpi> it affects read() too
    <braunr> sure
    <youpi> so there's no reason not to fix select()
    <youpi> that's the important point
    <braunr> but this doesn't mean we shouldn't pass the timeout to the server and expect it to handle it correctly
    <youpi> we keep raising issues with things, and not achieve anything, in the Hurd
    <braunr> if it doesn't, then it's a bug, like in any other kernel type
    <youpi> I'm not the one to convince :)
    <braunr> eh, some would say it's one of the goals :)
    <braunr> who's to be convinced then ?
    <youpi> jkoenig:
    <youpi> who raised the issue
    <braunr> ah
    <youpi> well, see the irc log :)
    <jkoenig> not that I'm objecting to any patch, mind you :-)
    <braunr> i didn't understand it that way
    <braunr> if you can't trust the servers to act properly, it's similar to not trusting linux fs code
    <youpi> no, the difference is that servers can be non-root
    <youpi> while on linux they can't
    <braunr> again, trust root and self
    <youpi> non-root fuse mounts are not followed by default
    <braunr> as with fuse
    <youpi> that's still to be written
    <braunr> yes
    <youpi> and as I said, you can stat the actual node and then dereference the translator afterwards
    <braunr> but before writing anything, we'd better agree on the solution :)
    <youpi> which, again, "just" needs to be written
    <antrik> err... adding a timeout to mach_msg()?
      that's just wrong
    <antrik> (unless I completely misunderstood what this discussion was about...)

#### IRC, freenode, #hurd, 2012-02-04

    <youpi> this is confirmed: the select hack patch hurts vim performance a lot
    <youpi> I'll use program_invocation_short_name to make the patch even more ugly
    <youpi> (of course, we really need to fix select somehow)
    <pinotree> could it (also) be that vim uses select() somehow "badly"?
    <youpi> fsvo "badly", possibly, but still
    <gnu_srs1> Could that the select() stuff be the reason for a ten times slower ethernet too, e.g. scp and apt-get?
    <pinotree> i didn't find myself neither scp nor apt-get slower, unlike vim
    <youpi> see strace: scp does not use select
    <youpi> (I haven't checked apt yet)

### IRC, freenode, #hurd, 2012-02-14

    <braunr> on another subject, I'm wondering how to correctly implement select/poll with a timeout on a multiserver system :/
    <braunr> i guess a timeout of 0 should imply a non blocking round-trip to servers only
    <braunr> oh good, the timeout is already part of the io_select call

### IRC, freenode, #hurdfr, 2012-02-22

    <braunr> the big problem with our implementation is that the select timeout is a client-side parameter
    <braunr> a parameter passed directly to mach_msg
    <braunr> so if you set a timeout of 0, chances are high that mach_msg returns before an RPC can even complete (a full client-server round-trip, that is)
    <braunr> and so when the timeout is 0 for non-blocking use, well, you don't block, but you don't get your events either ..
    <abique|work> maybe changing the timeout from 10ms to 10 us would improve the situation.
    <abique|work> because 10ms is a bit much :)
    <braunr> that's the historical unix system interval timer
    <braunr> and mach isn't preemptible
    <braunr> so that's not feasible as things stand
    <braunr> that said, it's not completely related
    <braunr> well, actually it is, we'd need something similar to linux's high res timers
    <braunr> well, either high resolution timers, or an easily programmable timer
    <braunr> currently only the 8254 is programmed, and to ensure roughly correct scheduling, it is programmed once, at 10ms, and that's it
    <braunr> so yes, specifying 1ms or 1us won't change anything about the interval needed to determine that the timer has expired

### IRC, freenode, #hurd, 2012-02-27

    <youpi> braunr: extremely dirty hack
    <youpi> I don't even want to detail :)
    <braunr> oh
    <braunr> does it affect vim only ?
    <braunr> or all select users ?
    <youpi> we've mostly seen it with vim
    <youpi> but possibly fakeroot has some issues too
    <youpi> it's very little probable that only vim has the issue :)
    <braunr> i mean, is it that dirty to switch behaviour depending on the calling program ?
    <youpi> not all select users
    <braunr> ew :)
    <youpi> just those which do select({0,0})
    <braunr> well sure
    <youpi> braunr: you guessed right :)
    <braunr> thanks anyway
    <braunr> it's probably a good thing to do currently
    <braunr> vim was getting me so mad i was using sshfs lately
    <youpi> it's better than nothing yes

# See Also

See also [[select_bogus_fd]] and [[select_vs_signals]].
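
# Zero-Timeout Polling Sketch

The `select({0,0})` pattern discussed in the logs can be illustrated with a small, portable C fragment. This is only a sketch of the caller's side, using plain POSIX calls; it does not show the glibc or Hurd server internals. The helper names `poll_readable` and `zero_timeout_demo` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: ask select() whether fd is readable right now.
   A zero timeout means "poll, do not block" (the select({0,0}) case).
   Returns 1 if readable, 0 if not, -1 on error. */
static int poll_readable(int fd)
{
    fd_set readfds;
    struct timeval timeout = { 0, 0 };

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    if (select(fd + 1, &readfds, NULL, NULL, &timeout) < 0)
        return -1;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}

/* Hypothetical self-check: poll an empty pipe, write one byte, poll again.
   Returns 1 if the pipe was reported "not ready" before the write and
   "ready" after it, 0 if not, -1 if the pipe could not be created. */
static int zero_timeout_demo(void)
{
    int fds[2];
    int before, after = -1;

    if (pipe(fds) != 0)
        return -1;

    before = poll_readable(fds[0]);
    if (write(fds[1], "x", 1) == 1)
        after = poll_readable(fds[0]);

    close(fds[0]);
    close(fds[1]);
    return before == 0 && after == 1;
}
```

Calling `zero_timeout_demo()` from a `main` function gives a quick check. Where `select` reports readiness as POSIX describes, the second poll should see the pending byte. The logs above suggest that, with the timeout applied on the client side only, such a zero-timeout poll could return before the servers had a chance to answer.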