OpenJK
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    By "crashing" I meant a DSI. There are Wipeout alerts like this, but I can't interpret them:

    Code:
    ----------------------------------------------------------------------
    SegList 0x08c1d6d4 GlobVec 0x2001ee54
    StackBase 0x09146403 StackSize 0x00004000
    TaskNum 0x00000005 Result2 0x00000000
    CurrentDir 0x08c20f3a CIS 0x08c2120f
    COS 0x08c21221 CES 0x08cd7241
    ConsoleTask 0x23084bf0 FileSystemTask 0x2019049c
    CLI 0x08c20f51 ReturnAddr 0x2451d004
    PktWait 0x00000000 WindowPtr 0x00000000
    HomeDir 0x08cd700e Flags 0x00000044
    ExitCode 0x00000000 ExitData 0x00000000
    Arguments 0x2343272c ShellPrivate 0x00000000
    NEW_Alert: MsgPort 0x255b0990
    NEW_Alert: Task 0x2329e360 <Shell Process>
    NEW_Alert: Msg 0x255b0970
    NEW_Alert: AlertMsgPort 0x200c3940
    NEW_Alert: Send Msg
    NEW_Alert: Wait Msg
    NEW_Alert: Replied Msg 0x255b0970
    NEW_Alert: continue


    TLSF_FreePooled also complains about freeing the wrong size of memory by various tasks. I guess the corrupted memory headers are to blame for this.
    This is just like television, only you can see much further.
  • »12.06.14 - 11:14
  • MorphOS Developer
    itix
    Posts: 1520 from 2003/2/24
    From: Finland
    This doesn't look like a Wipeout hit. It is just the log output that is printed when an alert is sent.

    Quote:


    TLSF_FreePooled also complains about freeing the wrong size of memory by various tasks. I guess the corrupted memory headers are to blame for this.


    Tasks not related to OpenJK at all, or tasks launched by OpenJK? If you are sharing memory pools across tasks you must protect them (MEMF_SEM_PROTECTED); if you are not, then some memory trashing is occurring. Or it could be due to an uninitialized pointer in some data structure. On Linux, memory that comes fresh from the system is zero-filled, but on Amiga memory is not cleared by default. This is a common source of memory-related issues with ports.
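
    A minimal sketch of the uninitialized-pointer case, assuming a hypothetical `Node` structure (the names are invented here and are not from OpenJK). It only illustrates why zero-filled memory hides the bug and uncleared memory exposes it: calloc() hands back zeroed storage, so a pointer field that was never assigned is null and can be passed to free() safely.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical structure, for illustration only (not taken from OpenJK).
struct Node {
    char *name;   // owned buffer, may legitimately stay unset
    Node *next;
};

// Allocate a Node whose fields start out as zero.
// calloc() returns zero-filled memory; plain malloc() makes no such promise.
// Assumption: a null pointer is all-bits-zero, which holds on common platforms.
Node *NewNode() {
    return static_cast<Node *>(std::calloc(1, sizeof(Node)));
}

// Release a Node. free(nullptr) is a no-op, so a zeroed, never-assigned
// 'name' is harmless here. With malloc() the same field could hold garbage.
void FreeNode(Node *n) {
    if (n == nullptr) return;
    std::free(n->name);
    std::free(n);
}
```

    Code that was only ever tested where fresh memory happens to be zeroed can rely on this by accident, which would fit the "works everywhere else" pattern.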
    1 + 1 = 3 with very large values of 1
  • »12.06.14 - 11:31
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    OpenJK doesn't start any new tasks, the TLSF_FreePooled warnings come from whatever happens to run at the time. In OpenJK all the memory is allocated either via malloc/calloc or C++ new (the two are not mixed), so this shouldn't be a problem anyway.
    Things start to go south after the game module (jagameppc.so) is unloaded. Is it safe to use dynload.library with C++ code? I use the LibNIX initialization from RTCW, with some comments from the SDK's nixskeleton.
    This is just like television, only you can see much further.
  • »12.06.14 - 12:48
  • MorphOS Developer
    itix
    Posts: 1520 from 2003/2/24
    From: Finland
    I have never used dynload, but I don't think C++ would be any more unsafe than C code. It is language-agnostic. I hope another MorphOS developer familiar with dynload can shed some light...

    By switching to the memory-list-based memory system (this is an option on the boot image command line) you could maybe get a grip on it. With memory lists, corruption is reported quickly (often a small free block follows an allocated memory chunk, so damage to it is detected soon). With TLSF, corruption is harder to spot. Wipeout should catch it, though. IIRC there is an option to check the pre- and post-walls at any time, by sending a signal (CTRL-E?) or by starting Wipeout another time... I can't remember right now. (I am travelling ATM.)

    [ Edited by itix 12.06.2014 - 19:58 ]
    1 + 1 = 3 with very large values of 1
  • »12.06.14 - 19:58
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    I tried wipeout with the memory list allocator, and I got the following hits:
    Quote:


    WIPEOUT HIT
    01-Jul-14 09:49:19
    FreePooled(0x236d1b80,0xdeadbee7,0) called
    Name: "Shell process" CLI: "openjo_sp"
    CallerStack[0] 0x24c27d98 at MorphOS:Games/OpenJK/base/jospgameppc.so Hunk 0 Offset 0x00145ac8

    WIPEOUT HIT
    01-Jul-14 09:49:19
    FreePooled(0x236d1b80,0xdeadf005,0) called
    Name: "Shell process" CLI: "openjo_sp"
    CallerStack[0] 0x24c27d98 at MorphOS:Games/OpenJK/base/jospgameppc.so Hunk 0 Offset 0x00145ac8


    That offset belongs to free() in libnix, just as in the other cases. It is called from the destructor of a map or a list. There's no sign of where this memory address gets corrupted.

    [ Edited by BSzili 01.07.2014 - 15:11 ]
    This is just like television, only you can see much further.
  • »01.07.14 - 14:00
  • MorphOS Developer
    itix
    Posts: 1520 from 2003/2/24
    From: Finland
    Quote:

    BSzili wrote:
    That offset belongs to free() in libnix, just as in the other cases.


    You must find out what is calling this free()... It is very likely some destructor, as you pointed out earlier, but we could find out which one (or was that already determined?).

    I can't remember this off the top of my head, but IIRC if you use the "STACKLINES=<number>" parameter you get a deeper backtrace. The default is 2, so maybe try something like 5 or 10.

    But since you are using shared objects, I am starting to wonder whether they depend on some constructor code that is not called. When building vanilla C code there are only some very simple constructors to initialize stdio and such. In C++ it is much more complex, and with my limited C++ experience I have no clue what kind of constructors there are. I would suppose that some static objects need to be initialized... If objects are loaded dynamically, this can't be done automatically, because constructors are a compiler-vendor-specific feature.

    Is it possible to build everything statically in one go? Or would it take too much memory or need deep changes to the source code?
    1 + 1 = 3 with very large values of 1
  • »01.07.14 - 14:18
  • MorphOS Developer
    itix
    Posts: 1520 from 2003/2/24
    From: Finland
    Oh, from the Wipeout documentation:

    Quote:


    4) Memory is filled with 0xDEADBEEF before it is freed, encouraging programs
    reusing freed memory to crash. See the configuration options REUSE and
    NOREUSE for more information on this subject.




    So in this case it looks like it is reading pointers from memory that has already been freed.

    FreePooled(0x236d1b80,0xdeadbee7,0) called
    FreePooled(0x236d1b80,0xdeadf005,0) called

    So it is something like

    free(ptr);
    free(ptr->another_ptr);

    although the pointers are not exactly 0xdeadbeef but have some offset added to them.
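
    To make the pattern concrete, here is a small simulation, not the real allocator. The `Object` layout and both helper functions are hypothetical, and pointers are modelled as 32-bit integers to mimic a 32-bit PowerPC target. The storage is only overwritten, never actually released, so the read after "freeing" is well-defined in this sketch. The 8-byte header is purely an assumption: it would account for the first hit, since 0xdeadbeef - 8 = 0xdeadbee7, but it does not explain 0xdeadf005.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fill pattern described in the Wipeout documentation quoted above.
constexpr std::uint32_t kPoison = 0xDEADBEEFu;

// Hypothetical object layout; "pointers" are plain 32-bit values here.
struct Object {
    std::uint32_t another_ptr;  // would point at a second allocation
    std::uint32_t value;
};

// Simulate what a debugging allocator does on free: overwrite the block
// with the poison pattern, one 32-bit word at a time.
void PoisonOnFree(void *block, std::size_t size) {
    unsigned char *p = static_cast<unsigned char *>(block);
    for (std::size_t i = 0; i + sizeof(kPoison) <= size; i += sizeof(kPoison)) {
        std::memcpy(p + i, &kPoison, sizeof(kPoison));
    }
}

// Assumption: the C library's free() steps back over a small header
// before passing the block to the pool. The header size is a guess.
std::uint32_t AddressPassedToPool(std::uint32_t user_ptr,
                                  std::uint32_t header_size) {
    return user_ptr - header_size;
}
```

    Reading `another_ptr` after `PoisonOnFree()` yields 0xdeadbeef, and stepping back 8 bytes gives 0xdeadbee7, the address in the first Wipeout hit.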
    1 + 1 = 3 with very large values of 1
  • »01.07.14 - 14:25
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    I forgot to mention this is all C++. According to the non-wipeout log, it happens while destructing this class:
    https://github.com/BSzili/OpenJK/blob/master/codeJK2/game/g_navigator.h#L130

    There's probably no way to link the game module into the main executable. I remember trying something like that before with another id Tech game, but it resulted in other weird crashes, because many globals didn't get re-initialized during the level change.
    This is just like television, only you can see much further.
  • »01.07.14 - 14:48
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    As a last resort, I'm thinking of ditching dynload.library in favor of my DLL loader. It works similarly to FPSE's one, and is loosely based on the one found in the Amiga port of Heretic 2 and Quake 2. The biggest advantage is loading the "shared objects" as regular executables, which means all the C library initializations and C++ constructors/destructors are taken care of.
    This is just like television, only you can see much further.
  • »23.07.14 - 10:48
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    BSzili wrote:
    As a last resort, I'm thinking of ditching dynload.library in favor of my DLL loader. It works similarly to FPSE's one, and is loosely based on the one found in the Amiga port of Heretic 2 and Quake 2. The biggest advantage is loading the "shared objects" as regular executables, which means all the C library initializations and C++ constructors/destructors are taken care of.


    FYI, dynload.library handles .ctors/.dtors on dlopen()/dlclose(), so, as itix observed, your problem is likely that something has already freed the memory that the destructor then tries to free again. It could be worth adding some debug output there.


    - CISC
  • »23.07.14 - 12:26
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    I thought dynload.library requires manual C library initialization, at least that is how RTCW loads its game modules. I already linked the affected code, and the crash happens when the "automatic" destructor frees a class member. Where am I supposed to add that debug code for that?
    This is just like television, only you can see much further.
  • »23.07.14 - 12:40
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    BSzili wrote:
    I thought dynload.library requires manual C library initialization, at least that is how RTCW loads its game modules. I already linked the affected code, and the crash happens when the "automatic" destructor frees a class member. Where am I supposed to add that debug code for that?


    C library de/constructors are different, they go in .ctdt, so indeed you have to handle those manually, but C++ class de/constructors go in .c/dtors. Make sure you are not double-handling .c/dtors as that would be an obvious culprit. :)

    Just add some debug output in ~CNavigator(); then at least you will be able to see the state of the class right before it is freed.


    - CISC
  • »23.07.14 - 13:05
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    Ach so, thanks for the clarification. I'm not handling ctors/dtors manually, only ctdt: https://github.com/BSzili/OpenJK/blob/amiga/code/aros/libnix_so.c
    I guess the dtors are only called by dynload.library's dlclose. What kind of debug info should I print? The addresses of the class members?
    This is just like television, only you can see much further.
  • »23.07.14 - 14:14
  • Acolyte of the Butterfly
    Georg
    Posts: 111 from 2004/4/7
    Quote:

    BSzili wrote:
    https://github.com/BSzili/OpenJK/blob/amiga/code/aros/libnix_so.c



    Isn't the order in which this constructor/destructor stuff gets called important? CallFuncArray() in that code always goes forward through the array. Maybe it needs to go backwards sometimes, depending on whether it's a constructor or a destructor array.
  • »23.07.14 - 16:44
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    Those are just the C constructors, and it's from the MorphOS SDK nixskeleton. I'm having trouble with the C++ constructors.
    This is just like television, only you can see much further.
  • »23.07.14 - 16:51
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    BSzili wrote:
    Ach so, thanks for the clarification. I'm not handling ctors/dtors manually, only ctdt: https://github.com/BSzili/OpenJK/blob/amiga/code/aros/libnix_so.c
    I guess the dtors are only called by dynload.library's dlclose. What kind of debug info should I print? The addresses of the class members?


    For example, yes. Anything that can clue you in on what's going on in the crash dump.

    Quote:

    Georg wrote:
    Isn't the order in which this constructor/destructor stuff gets called important? CallFuncArray() in that code always goes forward through the array. Maybe it needs to go backwards sometimes, depending on whether it's a constructor or a destructor array.


    Look closer; .ctdt array is priority-sorted before functions are called, besides the issue is not in .ctdt which only contains de/constructors created from constructor.h (usually only libnix stuff), but rather .c/dtors which are compiler-created de/constructors.
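
    For readers unfamiliar with priority-sorted init tables, the general idea can be sketched as below. This is a generic illustration, not the libnix or nixskeleton code: `InitEntry` and the three functions are hypothetical names, and whether the SDK code walks its table in exactly this way is not confirmed here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical table entry; this is not the libnix or SDK definition.
struct InitEntry {
    int priority;
    void (*func)();
};

// Sort by priority, keeping the original order of equal priorities.
void SortByPriority(std::vector<InitEntry> &table) {
    std::stable_sort(table.begin(), table.end(),
                     [](const InitEntry &a, const InitEntry &b) {
                         return a.priority < b.priority;
                     });
}

// Run set-up functions in ascending priority order.
void RunForward(const std::vector<InitEntry> &table) {
    for (const InitEntry &e : table) e.func();
}

// Run tear-down functions in the opposite order, so teardown mirrors setup.
void RunBackward(const std::vector<InitEntry> &table) {
    for (auto it = table.rbegin(); it != table.rend(); ++it) it->func();
}
```

    Once the table is sorted, the direction of the walk decides the call order, so the order of the entries as they were linked no longer matters.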


    - CISC
  • »24.07.14 - 05:11
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    I already know what's going on in the crashdump. One of those class members has a bogus address when it's destroyed. I just can't figure out why. I honestly don't know what should I print in the destructor to point me in the right direction. This only happens on MorphOS, none of the other supported platforms segfaults/crashes here, so I'm on my own with this problem.
    This is just like television, only you can see much further.
  • »24.07.14 - 06:44
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    BSzili wrote:
    I already know what's going on in the crashdump. One of those class members has a bogus address when it's destroyed. I just can't figure out why. I honestly don't know what should I print in the destructor to point me in the right direction. This only happens on MorphOS, none of the other supported platforms segfaults/crashes here, so I'm on my own with this problem.


    Well, then print all the member pointers, then at least you will see if it happens outside your code control or not. You might also want to print them at certain places in your main code as well, certainly right before the hits usually start, just add another class method that does that for you.
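
    Something along these lines would do, as a rough sketch. `NavigatorLike` and its members are invented stand-ins, since the real CNavigator layout is not reproduced here; the point is only a single method that prints the object's address and every member pointer, callable both from the destructor and from the main code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the real class; member names are invented.
class NavigatorLike {
public:
    NavigatorLike() : m_nodes(new int[4]), m_edges(new int[4]) {}
    ~NavigatorLike() {
        DumpState("~NavigatorLike");  // state right before members are freed
        delete[] m_nodes;
        delete[] m_edges;
    }
    NavigatorLike(const NavigatorLike &) = delete;
    NavigatorLike &operator=(const NavigatorLike &) = delete;

    // Format the object's address and member pointers into one line.
    std::string Describe(const char *where) const {
        char buf[160];
        std::snprintf(buf, sizeof(buf), "[%s] this=%p nodes=%p edges=%p",
                      where, static_cast<const void *>(this),
                      static_cast<const void *>(m_nodes),
                      static_cast<const void *>(m_edges));
        return buf;
    }

    // Print to stderr, which is normally not fully buffered, so the line
    // has a good chance of appearing even if a crash follows.
    void DumpState(const char *where) const {
        std::fprintf(stderr, "%s\n", Describe(where).c_str());
    }

private:
    int *m_nodes;
    int *m_edges;
};
```

    Calling `DumpState()` after the module is loaded and again in the destructor shows whether the member pointers changed in between, and whether any of them already carry a 0xdeadbeef-style value.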


    - CISC
  • »24.07.14 - 07:51
  • MorphOS Developer
    itix
    Posts: 1520 from 2003/2/24
    From: Finland
    The address is bogus because this class was already disposed and Wipeout overwrote its contents with 0xdeadbeef. So it is being disposed twice, which should not happen if the dead pointer had been cleared properly.

    What I would try is to print the class object's address and trigger an illegal write to 0x00000000 to get a stack trace. (IIRC there might be a system call to get stack trace but I am not sure ATM.)
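
    By "eliminating the dead pointer" I mean something like the following sketch. `Holder` is a hypothetical type, not code from OpenJK; it shows the usual idiom of clearing a pointer right after freeing it, so that a second dispose finds nullptr and does nothing.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical owner of a heap block; names are for illustration only.
struct Holder {
    void *data = nullptr;
    int frees = 0;  // counts real calls to free(), for demonstration

    // Drop any previous block, then allocate a new one.
    void Acquire(std::size_t n) {
        Release();
        data = std::malloc(n);
    }

    // Safe to call more than once: the pointer is cleared after the first
    // free(), so a later call sees nullptr and returns without freeing.
    void Release() {
        if (data != nullptr) {
            std::free(data);
            data = nullptr;
            ++frees;
        }
    }

    ~Holder() { Release(); }
};
```

    This only protects against a second dispose through the same object. If the whole object is destroyed twice, its members are read from already-freed memory and the guard cannot help.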
    1 + 1 = 3 with very large values of 1
  • »24.07.14 - 08:16
  • MorphOS Developer
    bigfoot
    Posts: 510 from 2003/4/11
    Quote:

    itix wrote:
    (IIRC there might be a system call to get stack trace but I am not sure ATM.)



    DumpTaskState(FindTask(0));
    I rarely log in to MorphZone which means that I often miss private messages sent on here. If you wish to contact me, please email me at [username]@asgaard.morphos-team.net, where [username] is my username here on MorphZone.
  • »24.07.14 - 08:29
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    This is the sole instance of CNavigator, it's essentially a singleton: https://github.com/BSzili/OpenJK/blob/master/codeJK2/game/g_nav.cpp#L28
    The hits always happen after dlclose() when this is destroyed. I still think this is only a symptom of something else messing up the memory, hence the FreePooled warnings about the wrong size being freed up.

    Anyway, I'll first try my own DLL loader to see if it's any better. If not then I can start printing the address of class members, etc.
    This is just like television, only you can see much further.
  • »24.07.14 - 11:35
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    It looks like my hunch was right. After replacing dynload.library with my own loader, the crashes on dlclose() disappeared. I kept loading saves and maps so that the game module was reloaded, and I had no crashes. I'll do some more playtesting, and if nothing comes up, I'll publish new executables for Jedi Academy and Jedi Outcast. Thanks to everyone who helped me (without Wipeout I couldn't have pinpointed the place of the crash), and to those who have waited patiently :-)
    This is just like television, only you can see much further.
  • »25.07.14 - 09:10
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    BSzili wrote:
    It looks like my hunch was right. After replacing dynload.library with my own loader, the crashes on dlclose() disappeared. I kept loading saves and maps so that the game module was reloaded, and I had no crashes. I'll do some more playtesting, and if nothing comes up, I'll publish new executables for Jedi Academy and Jedi Outcast. Thanks to everyone who helped me (without Wipeout I couldn't have pinpointed the place of the crash), and to those who have waited patiently :-)


    But does your loader go through .dtors on dlclose()?


    - CISC
  • »25.07.14 - 09:13
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    The "shared objects" in my case are plain executables loaded with SystemTags.
    This is just like television, only you can see much further.
  • »25.07.14 - 09:49
  • Priest of the Order of the Butterfly
    BSzili
    Posts: 559 from 2012/6/8
    From: Hungary
    The updated Jedi Academy and Jedi Outcast executables are now available on my website. The single player should no longer crash on module reloads, and I fixed tons of multiplayer issues, like the text input.
    This is just like television, only you can see much further.
  • »25.07.14 - 18:27