Diffstat (limited to 'open_issues/boehm_gc.mdwn')
-rw-r--r-- | open_issues/boehm_gc.mdwn | 553 |
1 files changed, 0 insertions, 553 deletions
diff --git a/open_issues/boehm_gc.mdwn b/open_issues/boehm_gc.mdwn deleted file mode 100644 index 2913eea8..00000000 --- a/open_issues/boehm_gc.mdwn +++ /dev/null @@ -1,553 +0,0 @@ -[[!meta copyright="Copyright © 2010, 2012, 2013, 2014 Free Software Foundation, -Inc."]] - -[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable -id="license" text="Permission is granted to copy, distribute and/or modify this -document under the terms of the GNU Free Documentation License, Version 1.2 or -any later version published by the Free Software Foundation; with no Invariant -Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license -is included in the section entitled [[GNU Free Documentation -License|/fdl]]."]]"""]] - -Here's what's to be done for maintaining Boehm GC. - -This one does need Hurd-specific configuration. - -It is, for example, used by [[/GCC]] (which has its own fork), so any changes -committed upstream should very likely also be made there. - -[[!toc levels=2]] - - -# [[General information|/boehm_gc]] - - -# Configuration - -<!-- - -git checkout reviewed -git log --reverse --pretty=fuller --stat=$COLUMNS,$COLUMNS -w -p -C --cc ..upstream/master --i -/^commit |^---$|hurd|linux|glibc - ---> - -Last reviewed up to the 5f492b98dd131bdd6c67eb56c31024420c1e7dab (2012-06-08) -sources, and for `libatomic_ops` to the -6a0afde033f105c6320f1409162e3765a1395bfd (2012-05-15) sources. - - * `configure.ac` - - * `PARALLEL_MARK` is not enabled; doesn't make sense so far. - - * `*-*-kfreebsd*-gnu` defines `USE_COMPILER_TLS`. What's this, and - why does no other configuration define it? - - * TODO - - [ if test "$enable_gc_debug" = "yes"; then - AC_MSG_WARN("Should define GC_DEBUG and use debug alloc.
in clients.") - AC_DEFINE([KEEP_BACK_PTRS], 1, - [Define to save back-pointers in debugging headers.]) - keep_back_ptrs=true - AC_DEFINE([DBG_HDRS_ALL], 1, - [Define to force debug headers on all objects.]) - case $host in - x86-*-linux* | i586-*-linux* | i686-*-linux* | x86_64-*-linux* ) - AC_DEFINE(MAKE_BACK_GRAPH) - AC_MSG_WARN("Client must not use -fomit-frame-pointer.") - AC_DEFINE(SAVE_CALL_COUNT, 8) - ;; - AM_CONDITIONAL([KEEP_BACK_PTRS], [test x"$keep_back_ptrs" = xtrue]) - - * `configure.host` - - Nothing. - - * `Makefile.am`, `include/include.am`, `cord/cord.am`, `doc/doc.am`, - `tests/tests.am` - - Nothing. - - * `include/gc_config_macros.h` - - Should be OK. - - * `include/private/gcconfig.h` - - Hairy. But should be OK. Search for *HURD*, compare to *LINUX*, - *I386* case. - - See `doc/porting.html` and `doc/README.macros` (and others) for - documentation. - - *LINUX* has: - - * `#define LINUX_STACKBOTTOM` - - Defined instead of `STACKBOTTOM` to have the value read from `/proc/`. - - * `#define HEAP_START (ptr_t)0x1000` - - May want to define it for us, too? - - * `#ifdef USE_I686_PREFETCH`, `USE_3DNOW_PREFETCH` --- [...] - - Apparently these are optimizations that we also could use. Have a - look at *LINUX* for *X86_64*, which uses `__builtin_prefetch` - (which Linux x86 could use, too?). - - * TODO - - #if defined(LINUX) && defined(USE_MMAP) - /* The kernel may do a somewhat better job merging mappings etc. */ - /* with anonymous mappings. */ - # define USE_MMAP_ANON - #endif - - * TODO - - #if defined(GC_LINUX_THREADS) && defined(REDIRECT_MALLOC) - /* Nptl allocates thread stacks with mmap, which is fine. But it */ - /* keeps a cache of thread stacks. Thread stacks contain the */ - /* thread control blocks. These in turn contain a pointer to */ - /* (sizeof (void *) from the beginning of) the dtv for thread-local */ - /* storage, which is calloc allocated. If we don't scan the cached */ - /* thread stacks, we appear to lose the dtv.
This tends to */ - /* result in something that looks like a bogus dtv count, which */ - /* tends to result in a memset call on a block that is way too */ - /* large. Sometimes we're lucky and the process just dies ... */ - /* There seems to be a similar issue with some other memory */ - /* allocated by the dynamic loader. */ - /* This should be avoidable by either: */ - /* - Defining USE_PROC_FOR_LIBRARIES here. */ - /* That performs very poorly, precisely because we end up */ - /* scanning cached stacks. */ - /* - Have calloc look at its callers. */ - /* In spite of the fact that it is gross and disgusting. */ - /* In fact neither seems to suffice, probably in part because */ - /* even with USE_PROC_FOR_LIBRARIES, we don't scan parts of stack */ - /* segments that appear to be out of bounds. Thus we actually */ - /* do both, which seems to yield the best results. */ - - # define USE_PROC_FOR_LIBRARIES - #endif - - * TODO - - # if defined(GC_LINUX_THREADS) && defined(REDIRECT_MALLOC) \ - && !defined(INCLUDE_LINUX_THREAD_DESCR) - /* Will not work, since libc and the dynamic loader use thread */ - /* locals, sometimes as the only reference. */ - # define INCLUDE_LINUX_THREAD_DESCR - # endif - - * TODO - - # if defined(UNIX_LIKE) && defined(THREADS) && !defined(NO_CANCEL_SAFE) \ - && !defined(PLATFORM_ANDROID) - /* Make the code cancellation-safe. This basically means that we */ - /* ensure that cancellation requests are ignored while we are in */ - /* the collector. This applies only to Posix deferred cancellation;*/ - /* we don't handle Posix asynchronous cancellation. */ - /* Note that this only works if pthread_setcancelstate is */ - /* async-signal-safe, at least in the absence of asynchronous */ - /* cancellation. This appears to be true for the glibc version, */ - /* though it is not documented. Without that assumption, there */ - /* seems to be no way to safely wait in a signal handler, which */ - /* we need to do for thread suspension. 
*/ - /* Also note that little other code appears to be cancellation-safe.*/ - /* Hence it may make sense to turn this off for performance. */ - # define CANCEL_SAFE - # endif - - * `CAN_SAVE_CALL_ARGS` vs. -fomit-frame-pointer now being on by - default for Linux x86 IIRC? (Which is an [[!taglink - open_issue_gcc]] for not including us.) - - * TODO - - # if defined(REDIRECT_MALLOC) && defined(THREADS) && !defined(LINUX) - # error "REDIRECT_MALLOC with THREADS works at most on Linux." - # endif - - - *HURD* has: - - * `#define STACK_GROWS_DOWN` - - * `#define HEURISTIC2` - - Defined instead of `STACKBOTTOM` to have the value probed. - - Linux also has this: - - #if defined(LINUX_STACKBOTTOM) && defined(NO_PROC_STAT) \ - && !defined(USE_LIBC_PRIVATES) - /* This combination will fail, since we have no way to get */ - /* the stack base. Use HEURISTIC2 instead. */ - # undef LINUX_STACKBOTTOM - # define HEURISTIC2 - /* This may still fail on some architectures like IA64. */ - /* We tried ... */ - #endif - - Being on [[glibc]], we could perhaps do similar as `USE_LIBC_PRIVATES` - instead of `HEURISTIC2`. Pro: avoid `SIGSEGV` (and general fragility) - during probing at startup (if I'm understanding this correctly). Con: - rely on glibc internals. Or we instead add support to parse - [[`/proc/`|hurd/translator/procfs]] (can even use the same as Linux?), - or use some other interface. [[!tag open_issue_glibc]] - This is also likely the issue causing the GDB [[!tag open_issue_gdb]] - `GC_find_limit_with_bound` SIGSEGV startup confusion described in - [[binutils]]. - - * `#define SIG_SUSPEND SIGUSR1`, `#define SIG_THR_RESTART SIGUSR2` - - * We don't `#define MPROTECT_VDB` (WIP comment); but Linux neither. - - * Where does our `GETPAGESIZE` come from? Should we `#include - <unistd.h>` like it is done for *LINUX*? - - * `include/gc_pthread_redirects.h` - - * TODO - - Cancellation stuff is Linux-only. In other places, too. 
- - * `mach_dep.c` - - * `#define NO_GETCONTEXT` - - [[!taglink open_issue_glibc]], but this is not a real problem here, - because we can use the following GCC internal function without much - overhead: - - * `GC_with_callee_saves_pushed` - - The `HAVE_BUILTIN_UNWIND_INIT` case is ours. - - * `os_dep.c` - - * `read` - - Sure that it doesn't internally (in [[glibc]]) use `malloc`. Probably - only / mostly (?) a problem for `--enable-redirect-malloc` - configurations? Linux with threads uses `readv`. - - * TODO. - - * `dyn_load.c` - - For `DYNAMIC_LOADING`. TODO. - - * `pthread_support.c`, `pthread_stop_world.c` - - TODO. - - * TODO. - - Other files also contain *LINUX* and other conditionals. - - * `libatomic_ops/` - - * `configure.ac` - - Nothing. - - * `Makefile`, `src/Makefile`, `src/atomic_ops/Makefile`, - `src/atomic_ops/sysdeps/Makefile`, `doc/Makefile`, `tests/Makefile` - - Nothing. - - * `src/atomic_ops/sysdeps/gcc/x86.h` - - Nothing. - - * b8b65e8a5c2c4896728cd00d008168a6293f55b1 configure.ac probably not all - correct. - - * `mmap`, b64dd3bc1e5a23e677c96b478d55648a0730ab75 - - * `parallel mark`, 07c2b8e455c9e70d1f173475bbf1196320812154, pass - `--disable-parallel-mark` or enable for us, too? - - * `HANDLE_FORK`, e9b11b6655c45ad3ab3326707aa31567a767134b, - 806d656802a1e3c2b55cd9e4530c6420340886c9, - 1e882b98c2cf9479a9cd08a67439dab7f9622924 - - * Check `include/private/thread_local_alloc.h` re - `USE_COMPILER_TLS`/`USE_PTHREAD_SPECIFIC`. - - -# Build - -Here's a log of a Boehm GC build run; this is from the -5f492b98dd131bdd6c67eb56c31024420c1e7dab (2012-06-08) sources, and for -`libatomic_ops` for the 6a0afde033f105c6320f1409162e3765a1395bfd (2012-05-15) -sources, run on kepler.SCHWINGE and coulomb.SCHWINGE.
- - $ export LC_ALL=C - $ (cd ../master/ && ln -sfn ../libatomic_ops/master libatomic_ops) - $ (cd ../master/ && autoreconf -vfi) - $ ../master/configure --prefix="$PWD".install SHELL=/bin/bash CC=gcc-4.6 CXX=g++-4.6 --enable-cplusplus --enable-gc-debug --enable-gc-assertions --enable-assertions 2>&1 | tee log_build - [...] - $ make 2>&1 | tee log_build_ - [...] - -Different hosts may default to different shells and compiler versions; thus -harmonized. Using bash instead of dash as otherwise libtool explodes. - -This takes up around X MiB, and needs roughly X min on kepler.SCHWINGE and -X min on coulomb.SCHWINGE. - -<!-- - - $ (make && touch .go-install) 2>&1 | tee log_build_ && test -f .go-install && (make install && touch .go-check) 2>&1 | tee log_install && test -f .go-check && { make -k check 2>&1 | tee log_check; (cd libatomic_ops/ && make -k check) 2>&1 | tee log_check_; } - ---> - -## Analysis - - $ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_build* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_build - $ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_build* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_build - $ diff -wu <(sed -f toolchain/logs/boehm-gc/linux/log_build.sed < toolchain/logs/boehm-gc/linux/log_build) <(sed -f toolchain/logs/boehm-gc/hurd/log_build.sed < toolchain/logs/boehm-gc/hurd/log_build) > toolchain/logs/boehm-gc/log_build.diff - - * only GNU/Linux: `configure: WARNING: "Explicit GC_INIT() calls may be - required."` - - * only GNU/Linux: `configure: WARNING: "Client must not use - -fomit-frame-pointer."` - - -# Install - - $ make install 2>&1 | tee log_install - [...] - -This takes up around X MiB, and needs roughly X min on kepler.SCHWINGE and X -min on coulomb.SCHWINGE. 
- - -## Analysis - - $ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_install | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_install - $ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_install | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_install - $ diff -wu toolchain/logs/boehm-gc/linux/log_install toolchain/logs/boehm-gc/hurd/log_install > toolchain/logs/boehm-gc/log_install.diff - - -# Testsuite - - $ make -k check - [...] - $ (cd libatomic_ops/ && make -k check) - [...] - -This needs roughly X min on kepler.SCHWINGE and X min on coulomb.SCHWINGE. - - -## Analysis - - $ ssh kepler.SCHWINGE 'cd tmp/source/boehm-gc/ && cat master.build/log_check* | sed -e "s%\(/media/data\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/linux/log_check - $ ssh coulomb.SCHWINGE 'cd tmp/boehm-gc/ && cat master.build/log_check* | sed -e "s%\(/media/erich\)\?${PWD}%[...]%g"' > toolchain/logs/boehm-gc/hurd/log_check - $ diff -wu <(sed -f toolchain/logs/boehm-gc/linux/log_check.sed < toolchain/logs/boehm-gc/linux/log_check) <(sed -f toolchain/logs/boehm-gc/hurd/log_check.sed < toolchain/logs/boehm-gc/hurd/log_check) > toolchain/logs/boehm-gc/log_check.diff - -There are different configurations possible, but in general, the testsuite -results of GNU/Linux and GNU/Hurd look very similar. - - * GNU/Hurd is missing `Call chain at allocation: [...]` output. - - `os_dep.c`:`GC_print_callers` - - -# TODO - - * What are other applications to test Boehm GC? Also especially in - combination with [[/libpthread]] and dynamic loading of shared libraries? - - * There are patches (apparently not committed) that GCC itself can use - it, too: <http://gcc.gnu.org/wiki/Garbage_collection_tuning>. - - * There's been some talking about it on GNU guile mailing lists, and two - Git branches (2010-12-15: last change 2009-09).
- - * <http://www.hpl.hp.com/personal/Hans_Boehm/gc/#users> - - -## IRC, OFTC, #debian-hurd, 2012-02-05 - -[[!tag open_issue_porting]] - - <pinotree> youpi: i think i found out the possible cause of the ecl and - mono issuess - <pinotree> -s - <youpi> oh - <pinotree> basically, we don't have the realtime signals (so no - SIGRTMIN/SIGRTMAX defined), hence things use either SIGUSR1 or - SIGUSR2... which are used in libgc to resp. stop/resume threads when - "collecting" - <pinotree> i just patched ecl to use SIGINFO instead of SIGUSR1 (used when - no SIGRTMIN+2 is available), and it seems going on for a while - <youpi> uh, why would SIGINFO work better than SIGUSR1? - <pinotree> it was a test, i tried the first "not common" signal i saw - <pinotree> my test was, use any signal different than USR1/2 - <youpi> ah, sorry, I hadn't understood - <youpi> you mean there's a conflict between ecl and mono using SIGUSR1, as - well as libgc? - <pinotree> yes - <pinotree> for example, in ecl sources see src/c/unixint.d, - install_process_interrupt_handler() - <youpi> SIGINFO seems a sane choice - <youpi> SIGPWR could have been a better choice if it was available :) - <pinotree> i would have chose an "unassigned" number, say SIGLOST (the - bigger one) + 10, but it would be greater than _NSIG (and thus discarded) - <youpi> not a good idea indeed - <pinotree> it seems that linux, beside the range for rt signals, has some - "free space" - <pinotree> i'll start now another ecl build, from scratch this time, with - s/SIGUSR1/SIGINFO/ (making sure ctags won't bother), and if it works i'll - update svante's bug - - <pinotree> mmap(...PROT_NONE...) failed - <pinotree> hmm... - <pinotree> apparently enabling MMAP_ANON in mono's libgc copy was a good - step, let's see - - -### IRC, OFTC, #debian-hurd, 2012-03-18 - - <pinotree> youpi: mono is afflicted by the SIGUSR1/2 conflict with libgc - <youpi> pinotree: didn't we have a solution for that? 
- <pinotree> well, it works just for one signal - <pinotree> the ideal solution would be having a range for RT signals, and - make libgc use RTMIN+5/6, like done on most of other OSes - <youpi> but we don't have RT signals, do we? - <pinotree> right :( - - -### IRC, freenode, #hurd, 2012-03-21 - - <pinotree> civodul: given we have to realtime signals (so no range of - signals for them), libgc uses SIGUSR1/2 instead of using SIGRTMIN+5/6 for - its thread synchronization stuff - <pinotree> civodul: which means that if an application using libgc then - sets its own handlers for either of SIGUSR1/2, hell breaks - <civodul> pinotree: ok - <civodul> pinotree: is it a Debian-specific change, or included upstream? - <pinotree> libgc using SIGUSR1/2? upstream - <civodul> ok - - -### IRC, freenode, #hurd, 2013-09-03 - - <congzhang> braunr: when will libc malloc say memory corruption? - <braunr> congzhang: usually on free - <braunr> sometimes on alloc - <congzhang> and after one thread be created - <congzhang> I want to know why and how to find the source - <congzhang> does libgc work well on hurd? - <braunr> i don't think it does - <congzhang> so , why it can't? - <braunr> congzhang: what ? - <congzhang> libgc was not work on hurd - <pinotree> why? - <congzhang> I try porting dotgnu - <braunr> ah - <braunr> nested signal handling - <congzhang> one program always receive Abort signal - <pinotree> and why it should be a problem in libgc? - <congzhang> for malloc memory corruption - <braunr> libgc relies on this - <congzhang> yes - <congzhang> so, is there a workaround to make it work? - <braunr> show the error please - <congzhang> http://paste.debian.net/34416/ - <pinotree> where's libgc? - <congzhang> i compile dotgnu with enable-gc - <pinotree> so? - <congzhang> I am not sure about it - <pinotree> so why did you say earlier that libgc doesn't work? - <congzhang> because after I see one thread was created notice by gdb, it - memory corruption - <pinotree> so what? 
- <congzhang> maybe gabage collection happen, and gc thread start - <pinotree> that's speculation - <pinotree> you cannot debug things speculating on code you don't know - <pinotree> less speculation and more in-deep debugging, please - * congzhang I try again, to check weather thread list changing - <congzhang> sorry for this - <braunr> it simply looks like a real memory corruption (an overflow) - <congzhang> maybe PATH related problem - <pinotree> PATH? - <congzhang> yes - <braunr> PATH_MAX - <braunr> but unlikely - <congzhang> csant do path traverse - <congzhang> I fond the macro - <congzhang> found - <congzhang> #if defined(__sun__) || defined(__BEOS__) - <congzhang> #define BROKEN_DIRENT 1 - <congzhang> #endif - <congzhang> and so for hurd? - <pinotree> BROKEN_DIRENT doesn't say much about what it does - <WhiteKIBA> nope - <WhiteKIBA> whoops - <congzhang> it seems other port meet the trouble too - <pinotree> which trouble? - <congzhang> http://comments.gmane.org/gmane.comp.gnu.dotgnu.developer/3642 - <congzhang> (gdb) ptype struct dirent - <congzhang> type = struct dirent { - <congzhang> __ino_t d_ino; - <congzhang> unsigned short d_reclen; - <congzhang> unsigned char d_type; - <congzhang> unsigned char d_namlen; - <congzhang> char d_name[1]; - <congzhang> } - <congzhang> - <congzhang> d_name should be char[PATH_MAX]? - <congzhang> and - http://libjit-linear-scan-register-allocator.googlecode.com/svn/trunk/pnet/support/dir.c - <pinotree> no - <braunr> stop pasting that much - <_d3f> uhm PATH_MAX on the hurd? 
- <braunr> and stop saying nonsense - <congzhang> sorry, i think four line was not worth to pastbin - <pinotree> they are 8 - <congzhang> never again - <braunr> just try by defining BROKEN_DIRENT to 1 in all cases and see how - it goes - * congzhang read dir.c again - <congzhang> braunr: it does not crash this time, I do more test - - -#### IRC, freenode, #hurd, 2013-09-04 - - <congzhang> hi, I am dotgnu work on hurd, and even winforms app - <congzhang> s/am/make - <congzhang> and maybe c# hello world translate another day :) - - -### IRC, freenode, #hurd, 2013-12-16 - - <braunr> gnu_srs: ah, libgc - <braunr> there are signal-related problems with libgc - - -## Leak Detection - -### IRC, freenode, #hurd, 2013-10-17 - - <teythoon> I spent the last two days integrating libgc - the boehm - conservative garbage collector - into hurd - <teythoon> it can be used in leak detection mode - <azeem> whoa, cool - <teythoon> and it actually kind of works, finds malloc leaks in translators - <braunr> i think there were problems with signal handling in libgc - <braunr> i'm not sure we support nested signal handling well - <teythoon> yes, I read about them - <teythoon> libgc uses SIGUSR1/2, so any program installing handlers on them - will break - <azeem> (which is not a problem on Linux, cause there some RT-signals or so - are used) - <teythoon> yes
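
The signal conflict discussed in the IRC logs above can be illustrated with a small C sketch. This is not code from libgc; it only mirrors the idea voiced in the logs: prefer signals from the real-time range (the `SIGRTMIN+5`/`SIGRTMIN+6` choice mentioned by pinotree) when the platform defines `SIGRTMIN`, and otherwise fall back to `SIGUSR1`/`SIGUSR2`, as the *HURD* case of `gcconfig.h` does with `SIG_SUSPEND` and `SIG_THR_RESTART`. The names `sketch_sig_suspend`, `sketch_sig_restart` and `sketch_conflicts_with_gc` are hypothetical and exist only for this example.

```c
/* Hypothetical sketch; not taken from the libgc sources. */
#include <signal.h>

/* Signal a collector could use to suspend threads. */
int sketch_sig_suspend(void)
{
#ifdef SIGRTMIN
    return SIGRTMIN + 5;   /* real-time signal range is available */
#else
    return SIGUSR1;        /* fallback when there are no real-time signals */
#endif
}

/* Signal a collector could use to restart threads. */
int sketch_sig_restart(void)
{
#ifdef SIGRTMIN
    return SIGRTMIN + 6;
#else
    return SIGUSR2;
#endif
}

/* Nonzero if an application handler installed on signo would collide
   with the two signals chosen above. */
int sketch_conflicts_with_gc(int signo)
{
    return signo == sketch_sig_suspend() || signo == sketch_sig_restart();
}
```

On a system without real-time signals, `sketch_conflicts_with_gc(SIGUSR1)` is nonzero, which is the situation the ecl and mono reports in the logs ran into; with a real-time range, applications remain free to use `SIGUSR1`/`SIGUSR2` themselves.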