Building KLH10 on MacOS Ventura fails #2270
When I go and step through the mark.tcl script manually by running
@eswenson1, are you running KLH10 on a Mac? If so, is there anything to add here regarding your success or lack thereof?
I'm not currently. Let me try to build and run. Update: I did just try to do a
I tried repeatedly downloading klh10.tgz from hactrn.kostersitz.com, and each time, while the download appears to work perfectly fine, the .tgz has errors when extracting. Perhaps it was not properly uploaded the last time? I'll try a full build, which, of course, is needed to test out this ticket. I was just curious to try the
I'm failing in the same way as previously described:
I manually tried to start KLH10 (in the right directory with the right command-line parameters), and got this:
It hung at this point. So either there is something wrong with the files loaded into KLH10 (e.g. dskdmp.216bin or @.ddt-u), which I doubt, or the built KLH10 no longer works when built on a Mac. I'll see if I can find an old klh10 I've built and run to see if it does better.
When I start my full build manually with this
Is it normal to get this error on startup of kn10-ks-its? I don't recall seeing it before.
I wonder if this is the cause?
I'm getting the same issue as you are:
I suspect the inability to allocate memory (the first error message on kn10-ks-its startup) is the cause.
I guess that would be a problem.
I am wondering if this has to do with the new memory-protection features in macOS. For example, I had to codesign gdb to get it to run and start debugging.
I dunno. I had to codesign gdb way before this problem started happening.
Found this in the klh10 install.txt
Maybe this is a red herring?
The code for this lives in
For some reason, I cannot run my old version of kn10-ks-its under gdb. So I'm unable to debug. I get this:
I'm hung after the [New Thread...] message from gdb. I never see any messages from kn10-ks-its. The "mark^[g" was a silly/vain attempt to see if it had started without messages and NSALV was waiting for input (it wasn't).
I wonder if shared memory settings need to get updated. This is what I have:
I increased kern.sysv.shmmax to double that amount and I still get a failure. But the value emitted in the error message is still 4194304. I wonder if I have to reboot the system after changing the value? I thought you could simply do
So my change appears to take effect. Not sure why kn10-ks-its is asking for 4194304. Can you tell from the code what it wants?
It does look like kn10-ks-its is only wanting 4M, so the 4194304 value is what it is asking for. My "shmmax" value is double that, so the shmget call should succeed. Not sure why it isn't. Googling...
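For anyone following along, the relevant limits can be inspected like this (these are macOS's kern.sysv sysctl names; a diagnostic sketch, not commands quoted from the thread):

```shell
# Current System V shared-memory limits (macOS naming):
sysctl kern.sysv.shmmax    # maximum size of a single segment, in bytes
sysctl kern.sysv.shmall    # maximum total shared memory, in 4 KB pages
sysctl kern.sysv.shmseg    # maximum segments per process

# Segments already allocated system-wide. Existing segments count
# against the total, so shmget can still fail even when shmmax
# looks big enough for the single 4 MB request.
ipcs -m
```

Note that the limits are system-wide, so other processes' segments eat into the same budget.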
I found that once it crashes in the terminal window, I have to start a new terminal session for the emulator to get to the prompt again. Something in that terminal session gets borked after the crash.
Can you use gdb to determine what the error code from the shmget failure is? That might provide a clue.
For me, creating a new shell doesn't fix my inability to run kn10-ks-its under gdb. I never get to see any output from kn10-ks-its after gdb reports that a new thread is created.
I may have to rebuild my kn10-ks-its (good idea anyway). I was able to attach to my kn10-ks-its process from gdb after it was started and see this:
I'll rebuild and retry.
The memsiz is calculated in klh10.c.
This is what I get in gdb:
Need to do a CONTINUE in GDB after attaching so it takes the input on the emulator side. |
Yeah, I did. I got:
and in the gdb session:
So no help.
At least we are both seeing the same thing.
I do not think it has to do with the amount of memory the emulator tries to allocate. I changed it to try just 50% of the allocation. No change in behavior.
What is the error code you're getting from shmget?
Did you run MARK$G?
Git LFS can deal with the large-binary-files issue, though it's hardly the only solution for keeping sane copies of such things around.
All the missing info I needed is resolved now, the input sequence being:
So just FWIW, I can reproduce this now ... lldb, though, will not do anything useful (it is looping or something, and not accepting input to DDT, so you cannot input
When I run under lldb on an M2 Mac, I get this:
So in addition to the shmget failure (which we've noted already), I'm getting errors setting up dsk0 too. Are others seeing this? And yes, I do have an rp0.dsk file at ../../out/klh10/rp0.dsk.
Doesn't klh10 do disk I/O in subprocesses? If so, and it can't establish shared memory, that seems like the same kaboom.
Two thoughts:
I think I ran ipcs and saw none created. My shmmax value is double the 4M that KLH10 is requesting. Someone (Mike) already tried reducing the amount requested, and it still fails. However, Mike said that if the shared memory request fails, the code tries local memory, which appears to succeed, and KLH10 continues. However, for me, KLH10 can't set up access to the disk (rp0.dsk), which may well be why it segfaults on the first disk write.
I see in the output and in the source where it falls back to local memory for general purposes, but I'm having trouble finding a similar thing for the RPXX.
Not sure what I broke, but I can no longer build klh10 on my M2 Mac. I get linker errors:
Anyone have a clue why this might happen? It seems that it is only linking one object file (klh10.o). How do I find out how it is invoking ld? I tried adding to the command line, but that didn't help.
Where does
This thread is getting unwieldy, though...
I suspect it is defined by KLH10 in one of its sources and somehow my make is only linking the one object file. I probably screwed up adding “-g -Oo” to the compile phase.
I've tracked down the cause of the
In other words, the shmget failure directly results in the disk initialization (and IMP initialization) failure. So we need to get to the bottom of the shmget failure. Note that there are two shmget failures -- the one reported earlier, where klh10 retries with local memory, and THIS shmget failure, where there is no local-memory retry and where the failure directly causes a disk initialization failure. Also: when I run under macOS on my M2 Mac, I don't get errno = 11 in my two shmget failures, but rather errno = 12 -- Cannot allocate memory.
Ok, I resolved the shmget error. I did this:
This will change these settings for the current bootload. I'm not sure how to make the change permanent on macOS (there is no /etc/sysctl.conf file).
In order to make the changes permanent, you have to create a PLIST. See this article for instructions: https://arc.net/l/quote/hghlubid
Can we do a similar retry to get local memory there?
Probably not. The main process communicates with the disk and network subprocesses through shared memory.
I did manage to complete the ITS build and run the resulting system successfully after I increased the shared memory limits. I'm not really sure why the defaults weren't sufficient, since we're trying to allocate an amount equal to the default limit, but I suspect that there is already some shared memory allocated, and therefore the amount requested by kn10-ks-its exceeds the total limit of the system. I had, on at least one machine, tried simply doubling the maximum value, and that didn't work. Allocating 16 times the default did work, so I guess I should figure out the minimum value. That value, however, might be different for each person depending on the amount of shared memory the existing running programs are consuming. Also note that updating shmmax should be accompanied by a corresponding, scaled, value for shmall.
So it turns out that you don't need to set the shared memory limit to as much as 64MB. 32MB works fine too. 16MB doesn't, however, and the default of 4MB, of course, doesn't work either. So the two commands to manually update the shared memory settings you need are:
To make these changes permanent, create the file /Library/LaunchDaemons/sysctl.plist with the following contents:
And then make sure it will get run on system boot by invoking:
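Putting the pieces from this thread together, the whole procedure might look like the following. The 32 MB value is the one reported to work above; the plist contents and its label are my reconstruction (the original file was not quoted here), so treat them as an illustrative guess:

```shell
# One-shot change for the current boot (32 MB max segment size;
# shmall is counted in 4 KB pages, so 32 MB / 4096 = 8192):
sudo sysctl -w kern.sysv.shmmax=33554432
sudo sysctl -w kern.sysv.shmall=8192

# Persistent change: a LaunchDaemon that re-applies the settings at boot.
# (Plist layout below is a plausible sketch, not the exact file.)
sudo tee /Library/LaunchDaemons/sysctl.plist >/dev/null <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
 "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>sysctl</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/sbin/sysctl</string>
    <string>kern.sysv.shmmax=33554432</string>
    <string>kern.sysv.shmall=8192</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>
EOF

# Register the daemon so it runs at boot:
sudo launchctl load /Library/LaunchDaemons/sysctl.plist
```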
We should add these instructions to the KLH10 documentation and to the ITS build documentation. @larsbrinkhoff: do you agree, and if so, where should this go? And we should say that macOS Ventura and later will need this fix. Earlier releases appear not to.
A bit more info -- since people are wondering (on IRC) why we need this raised limit. The
After starting klh10 (as root, for network reasons), I see this:
Those entries for root are those for klh10. As you can see, we are allocating 4 shared memory regions. The sizes are: 4488, 4194304, 5416, and 4016. All told, that is 4,208,224 bytes, which is slightly over 4MB. Since the sysctl shared memory limits are per-system (all users), clearly, 4MB isn't enough. 16MB isn't enough either, due to the other processes using shared memory as well. However, I think there may be an issue with shared memory freeing in klh10. All those entries, for eswenson, in the ipcs output above do NOT correspond to processes that still exist. I suspect these are old klh10 allocations that are not getting freed properly, perhaps only on error exits. I'll have to wait until I can log out and log back in again, or reboot (I have too many active work-related things going on on my machine to reboot now). Then I'll check the shared memory segments to see if there are any allocated, and I'll experiment with klh10 to see if they persist after various conditions. In any case, the default 4MB allocation size is NOT enough to allow a single KLH10 instance to start under Ventura or later. It MAY be that we don't need to go as high as 32MB, and that the only reason I needed 32MB was because some of the other shared memory segments were not freed when klh10 bombs out. More experimentation is needed.
NATTCH of 0 seems diagnostic. If there are none after boot, it might be informative to see whether an emulator crash leaves different debris than a clean exit. If the above is typical, then a bit over 20MB is probably enough. I don't recall the cost of raising the limit; it may or may not be worth getting too fancy with the instructions.
FWIW: I tried the download and unpack from the link multiple times, and it comes across fine, both on my fiber and my mobile 5G connections.
Good catch on that. Seems that all of those shared memory segments are detritus. And yes, we probably should cite (in the documentation/prerequisites) a real minimum. I'll try this out on a cleanly booted session and try to come up with a minimum value. But yes, 20MB is probably sufficient.
Yes, I agree instructions should be added. I think most of it should go in the KLH10 repository: a readme and doc update, and possibly some script that a user can run. Then the ITS repository could refer to that, and offer to run the script.
Can you upload it someplace?
I created the plist file and ran the launchctl command as specified. The suggested command for "richer errors" complained with a Usage message.
Any chance this is about System Integrity Protection?
I'm now getting the same error as @rmaldersoniii. However, this may be because the sysctl plist is already loaded. I did this:
which shows that it is present. And then I did a:
and didn't get any errors. So I'd recommend doing the same two commands. And then running:
to see if you already have the two settings. Doing this after a reboot, of course, would confirm that the PLIST was executed on boot. It is possible that you need to enable this daemon as well:
And you can get detailed info on the daemon with:
This should provide status information about the daemon and indicate the success or failure of running it.
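For reference, the checks described above might look like this in a shell. The system/sysctl service name assumes the plist's Label is "sysctl", which is a guess, since the file's contents were not quoted in this thread:

```shell
# Is the daemon loaded? (label assumed to be "sysctl")
sudo launchctl list | grep sysctl

# Load it by hand if it is not:
sudo launchctl load /Library/LaunchDaemons/sysctl.plist

# Enable it in the system domain and dump its status:
sudo launchctl enable system/sysctl
sudo launchctl print system/sysctl

# Confirm the settings actually took effect:
sysctl kern.sysv.shmmax kern.sysv.shmall
```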
Logging an issue to keep track of the work here.
The build fails really quickly at the start