KStars is crashing and you might have a memory leak, but it is not obvious that KStars is crashing because you are out of memory.
My system is quite different: it is not running Astroberry, it has 8GB of memory, and I allow the system to page to a swap file. But last week I ran a simple test using simulators, taking 10K 1-second images and storing the .fits files.
KStars is a substantial program, on my system just loading it without using Ekos or drivers.
Getting through initial setup and imaging sessions will obviously require more resources, none of which can be considered a memory leak. So I started gathering memory info after 200 images to allow the software to establish itself.
KStars Process going from 200 to 8,000 images.
# images:  200    4K     8K     Growth%
VM GB:     2.07   2.08   2.11   2%
RES MB:    351    364    393    12%
MEM%:      4.3%   4.5%   4.9%   0.6%
Used GB: 184.108.40.206
Avail GB: 220.127.116.11
Swap MB: 457792
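The growth figures can be checked with a quick calculation. This is only a sketch: it assumes the VM row reads 2.07 GB at 200 images and 2.11 GB at 8K images, and the RES row reads 351 MB and 393 MB. The function name growth_percent is just something made up for the illustration.

```python
def growth_percent(start, end):
    """Relative growth from start to end, expressed as a percentage."""
    return (end - start) / start * 100.0


# Assumed readings at 200 images and at 8K images.
vm_growth = growth_percent(2.07, 2.11)   # virtual size, GB
res_growth = growth_percent(351, 393)    # resident size, MB

print(f"VM growth:  {vm_growth:.1f}%")
print(f"RES growth: {res_growth:.1f}%")
```

The MEM% growth of 0.6% is a different kind of figure: it is the plain difference in percentage points (4.9 minus 4.3), not a relative growth.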
KStars/INDI had around 8.5K files/sockets open, and this remained fairly consistent during the test. So memory use did continue to grow, but not alarmingly so. After 8K image files there was nothing that would stress the system's ability to manage the memory. It did use some swap space, but just at a housekeeping level.
I want to suggest something about the scripting you are doing, and I hope I am not making too many wrong assumptions.
We see the operating system using a good portion of its memory as a disk cache for performance reasons. I believe that if it needs more memory for resident programs it will trade off that cache memory to satisfy the needs of the requesting program. That is the memory identified as available. When you force the clean pages stored there out of the cache, presumably you are moving the memory from being available to being free. I do not think you are making new memory available to KStars. It would seem that the operating system will then get busy establishing its cache again. I think it would be better to leave management of the cache up to the OS. If there are cached pages and some process needs more memory, the OS should be able to free up those cached pages itself.
I am thinking the see-sawing shown in the graph might be the OS setting up the cache and the script undoing that.
Below is the part of your initial post that shows the transition from pre to post crash; I think this is the area referred to:
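To see the difference between "free" and "available" for yourself, here is a minimal sketch that reads the MemFree and MemAvailable fields from /proc/meminfo on Linux. The helper names parse_meminfo and summarize are made up for this example, and the sample numbers are only rough conversions of the 104MB free / 2.4GB available reading from the first post, not real /proc/meminfo output.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def summarize(info):
    """Return free, available, and the gap between them (all in kB)."""
    free = info["MemFree"]
    available = info["MemAvailable"]
    return {
        "free_kb": free,
        "available_kb": available,
        # Memory the kernel estimates it could hand over without swapping,
        # mostly clean cache pages, beyond what is already free.
        "reclaimable_kb": available - free,
    }


# Illustrative sample only (approximate, not copied from a real system).
SAMPLE = "MemFree:          106496 kB\nMemAvailable:    2516582 kB\n"

if __name__ == "__main__":
    path = "/proc/meminfo"
    if os.path.exists(path):
        with open(path) as f:
            print(summarize(parse_meminfo(f.read())))
    else:
        print(summarize(parse_meminfo(SAMPLE)))
```

If the idea above is right, dropping the cache should raise free_kb while leaving available_kb roughly where it was.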
2022-05-16 18:28:13; Temp=43.3ºC. RAM 'Used': 1.2GB (31.4%); RAM 'Free': 104MB ( 2.8%); RAM 'Available': 2.4GB (64.0%); RAM used in Shared & Buffer: 66.7%. Swap used:100.0%. µSD @/dev/root uses 39,604,448kB.
2022-05-16 18:28:15; Temp=42.8ºC. RAM 'Used': 1.0GB (24.4%); RAM 'Free': 489MB (17.1%); RAM 'Available': 3.1GB (83.1%); RAM used in Shared & Buffer: 66.8%. Swap used:98.8%. µSD @/dev/root uses 39,607,008kB.
2022-05-16 18:28:16; Temp=42.3ºC. RAM 'Used': 459MB (12.1%); RAM 'Free': 847MB (22.4%); RAM 'Available': 3.1GB (83.8%); RAM used in Shared & Buffer: 65.7%. Swap used:98.8%. µSD @/dev/root uses 39,607,008kB.
If the idea holds that the OS will hand over "available" memory as needed, then just before the crash 2.4GB should still have been accessible. This strongly suggests that the crash is not due to an out-of-memory error. Perhaps you could post a log file somewhere that would show otherwise.
About the swap file, which is the only thing I can see that is maxed out.
What you are running is a kind of stress test: a large software application creating thousands of images on what today is considered constrained hardware. At the same time, because of the very small swap space, the test has to run without being able to trade off unused or less used pages in memory, because it is not being allowed to page them out to disk. At one time I ran on a Raspberry Pi and also an Intel PC. In both cases I added a USB drive or better to run the software from and to use for the swap. I certainly would not want to run the swap on a SSD card.
We know from its use of the swap file that the system is trying to page, perhaps to get the older, less used pages out of memory. My larger system, even with memory to spare, wants to use the page file. I am not trying to make a case that the lack of swap space is causing a crash. I am also not trying to say KStars is being closed because the OS has started killing processes; I thought it only did that to save itself when unable to satisfy memory requests.
Unless it has got itself into a state where not being able to page makes it unable to turn available memory back into free.
I do wonder whether giving it a 2GB or even a 1GB page file would make a difference. It is tough if you only have a small SSD card on the system.
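One way to rule the OS killing processes in or out is to search the kernel log for out-of-memory kill messages. The sketch below assumes the messages look like "Out of memory: Killed process <pid> (<name>)", which is the wording I believe recent kernels use; older kernels word it slightly differently, so treat the pattern as an assumption. The function name find_oom_kills and the sample log lines are invented for illustration and are not from the original poster's system.

```python
import re

# Assumed message shape; the exact wording varies between kernel versions.
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) for OOM kills found in log text."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_PATTERN.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


# Hypothetical sample lines, made up for this example.
SAMPLE_LOG = (
    "[12345.678] usb 1-1: new high-speed USB device number 4\n"
    "[12399.001] Out of memory: Killed process 2311 (kstars) total-vm:2200000kB\n"
)

if __name__ == "__main__":
    print(find_oom_kills(SAMPLE_LOG))
```

On a real system the text to scan would come from the kernel log (for example the output of dmesg saved to a file). An empty result would support the view that KStars is not being killed by the OS for lack of memory.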
That is the last time, promise.
Mach1, TS86SDQ, ASI071, ASI174, OAG, focusPro
Using memory as file cache is fine and to be expected/wanted. Paging out things that have not been used for a ‘long’ time is fine too (I would think this is data rather than executable, but I could be wrong). I think I’ve even changed a setting on my Astroberry to make it keep things in memory more rather than swap them out.
The problem is more why KStars is dying: is it a problem with the code, some Linux setting, or a bug in some part of Linux? It could just be coincidence, and it’s not actually the memory/swap that has run out in some way. Really there’s a need to look at a core dump or trace to see what was happening when it failed. Pete seems to be the only one with the patience to take thousands of photos, and it sounds like it takes a while to do. Unfortunately I don’t know enough to be able to advise how to get that dump; it has been ages since I’ve looked at such a thing.
Yep that’s the setting I rather randomly changed
I also use zram/zswap for swap and have even used tmpfs for /tmp (though I ran into problems due to temporary files building up when doing plate solving). This is all on a 4GB Pi. I don’t think I’ve run out of RAM, and I have only ever had minimal swap usage, so KStars/Ekos isn’t exactly eating all the memory. That said, I’ve only gotten things all working recently, so I have only taken around a hundred images in a session.
I used to have more problems when I took larger resolution images (with the same camera), but I stopped using the FITS viewer all the time, switched off the notification sounds, changed to APS-C resolution and applied lots of updates, which seems to have ‘cured’ that. No idea what it was though, as it was quite random.