Driver OnStep (LX200 like) for INDI

  • Posts: 322
  • Thank you received: 31
> getCommandSingleCharResponse is sometimes a replacement
> for getCommandString, but only if it lacks a trailing #

But the GX98 returns an ASCII letter, # terminated?

For the use case where the return is a Boolean, why not rename
the function to what it really expects: getCommandBooleanResponse?

> all uses now in network-timeout no longer have that, and are
> noted as such.)

I am concerned that all this is not tested over WiFi. We can't submit
to Jasem only to have the next packages show bugs.

If it were up to me, I would submit only Alain's change of GX98 because
we are sure that it fixes the rotator issue (the urgent matter at hand
with SWS), and it is small enough to be easy to understand.

Later, you can do the tidying up and error checking in a future round.
That is, unless the changes get adequate testing before submission
upstream.
2 years 5 months ago #77698

  • Posts: 161
  • Thank you received: 39
Sometimes I really hate the INDI forum. (I typed up a response, then misclicked. Here's v2.)

Regarding SingleChar vs. longer responses: some commands return a bare character, and some return a character followed by #. Often it's a letter (with #) versus a number (commandError), but not always. I made a mistake with :GX98#, so I went through and verified all the others. Honestly, if they were all #-terminated, that would be good. Unfortunately, that's not the case, and some commands will do either. This is the reason getCommandSingleCharErrorOrLongResponse exists: getCommandString doesn't handle that case well, because each error response causes a 5 second timeout. With different versions of OnStep, that was a major contributor to the very long startup times.

getCommandString does include a return value, so we can do error checking for connections with it as well if we keep using it anywhere, but we've got to wrap that anyway. Or we could try modifying getCommandString, which is an option I don't really like.
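
The split between bare single-character replies and #-terminated replies can be sketched as a small classifier. This is only an illustration under stated assumptions: `classifyReply`, `ReplyKind` and `Reply` are hypothetical names, and the code is not the driver's actual getCommandSingleCharErrorOrLongResponse logic.

```cpp
#include <string>

// Hypothetical classification of the bytes received for one command.
enum class ReplyKind
{
    Timeout,    // nothing arrived at all
    SingleChar, // one bare character, no trailing '#', e.g. "0" or "1"
    Terminated, // payload followed by '#', e.g. "R#" or "11:13:36#"
    Partial     // several bytes but no terminator (yet)
};

struct Reply
{
    ReplyKind kind;
    std::string payload; // reply text with any trailing '#' removed
};

// Assumption: 'raw' holds whatever was read from the port before the
// read returned, either on '#' or on timeout.
inline Reply classifyReply(const std::string &raw)
{
    if (raw.empty())
        return {ReplyKind::Timeout, ""};
    if (raw.back() == '#')
        return {ReplyKind::Terminated, raw.substr(0, raw.size() - 1)};
    if (raw.size() == 1)
        return {ReplyKind::SingleChar, raw};
    return {ReplyKind::Partial, raw};
}
```

A caller that knows a command may answer either way could pair this with a short inter-byte wait, so a bare error digit is accepted quickly instead of sitting out the full timeout for a # that never comes.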

Short version: SWS will have spikes. The cause could be something else, but after running for an hour, mine crashed on a command that was otherwise 100% fine, because there's no error checking. Boom. That's with longer timeouts. Sometimes wireless networks time out, and that shouldn't crash the program. It was probably responsible for the very occasional crashes I was never able to reproduce with the add-on WiFi.

Boolean is bad because there's no good way to check whether there was an error. I mean, we could add another variable, an error number, as an argument and check that. It'd be nice to return multiple things (boolean, error), but that would require a struct and defeat the purpose of a boolean, just as adding another variable would. You'd have to do something like the following.

Right now there's a batch of calls like this (or expecting a 0), which do need to be fixed:
if (getboolean(normalcalls) == 1)
{
    do_whatever;
}

For a boolean that also does error checking, we get roughly this:
bool bool_ret = function(normal_calls, error);
if (error == good) // comparison, not assignment
{
    if (bool_ret)
    {
        do_whatever;
    }
}
else
{
    handle_error;
}

Compared to how the Single/Long Response works now, it saves very little:
int int_ret = function(normal_calls, response);
if (int_ret > 0) // sometimes >=
{
    if (response == xyz)
    {
        do_whatever;
    }
}
else
{
    handle_error;
}
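
One possible middle ground between the two patterns above, assuming C++17 is available, is `std::optional<bool>`: an empty value stands for a failed or timed-out command, so the boolean and the error travel together without a separate struct or an extra argument. `queryFlag`, `describe` and the `Transport` callback are hypothetical names for illustration, not part of the OnStep or lx200 drivers.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical transport: sends a command and returns the raw reply,
// or an empty string when nothing came back before the timeout.
using Transport = std::function<std::string(const std::string &)>;

// Returns true/false for a "1"/"0" reply and std::nullopt for anything
// else (timeout, garbage), so an error cannot be mistaken for "false".
inline std::optional<bool> queryFlag(const Transport &send, const std::string &cmd)
{
    const std::string reply = send(cmd);
    if (reply == "1")
        return true;
    if (reply == "0")
        return false;
    return std::nullopt;
}

// Example caller with all three outcomes spelled out.
inline std::string describe(const std::optional<bool> &flag)
{
    if (!flag.has_value())
        return "error";
    return *flag ? "present" : "absent";
}
```

One caveat with this approach: `if (flag)` on a `std::optional<bool>` tests whether a value exists, not whether it is true, so callers need to check `has_value()` first and then dereference.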

It's not as pretty, but it won't crash. And honestly, it makes programming and debugging easier for me when you can just see the response, rather than, in some cases, having if(!sendCommand), which returns 0 or 1, where sometimes 0 is good and sometimes 1 is. Because the check is == 0, a 0 response returns 1/true and a 1 response returns 0/false. Add a negation of that in roughly half the places it's used, and sometimes I get bit-flipped going between the code and the logs showing the response. (Roughly half the calls are like that, so we also can't just say false == error.)

Anyway, those are my thoughts on it. We probably should do bounds checking and make sure the response is as expected if it's a number, but I think that usually, if we get a response, it will be the correct one.

Unfortunately, in a lot of cases, terse, easy-to-read code that works in some situations == crashy code when something else is introduced. Here, that something is the possibility that any particular command may fail. So we should either handle it, say we don't care about wireless (which would be a stupid thing to do, IMO), or accept that it's just going to crash occasionally. (Which, to be fair, is what I am saying at this moment, since we rely on lx200, but I do hope to get it to a state where that can be eliminated. If we can eliminate half of them, that's also good, IMO.)

Worries about touching the code: yeah, that's why it's tested. It's also why I'll probably set up an alt-az tester with WiFi at some point. (I need to make some things for my Dob anyway, but that's at least a little bit away.)

Sorry if that's a bit verbose. Hopefully that makes sense. Actually, I'm thinking maybe it's not a bad idea to modify getCommandString and add a variable timeout for all the lx200 stuff. I do think there was a reason not to that I'm forgetting at the moment. I should look at that again.
2 years 5 months ago #77709

  • Posts: 322
  • Thank you received: 31
I realize that managing the different versions and all the devices is quite complex.
So again, thank you so much for putting in all the effort.

What I meant by getCommandBoolean is handling commands that return only 1 or 0 without the trailing hash.

There are many of them here:

onstep.groups.io/g/main/wiki/23755

We can't change them to something else, so they are what they are.

:FA and :fA are examples of these boolean returning functions, where a single character is returned and checked.

I tested again with Alain's change (just GX98) but reverted the timeout value in SWS to the default of 200 ms (I lowered it to 100 ms a few days ago), and it works as advertised.

Maybe a value of 300 ms or 400 ms for timeouts would be a good catch all? I base that on USB being much faster, and the old WiFi having a default timeout of 60 ms (vs. 200 ms in SWS), so 400 or so should be plenty. Maybe some commands will be slower, and those may be given a longer timeout.
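
The idea of a catch-all timeout with longer values for known slow commands could look roughly like this. The 400 ms default comes from the range suggested above; `timeoutFor`, `kDefaultTimeout` and the override table are hypothetical, and any per-command values would have to come from real measurements.

```cpp
#include <chrono>
#include <map>
#include <string>

using std::chrono::milliseconds;

// Catch-all timeout, taken from the 300-400 ms range suggested above.
constexpr milliseconds kDefaultTimeout{400};

// Returns the override for 'cmd' if the table has one, else the default.
// The table itself is an assumption; no slow commands are listed here.
inline milliseconds timeoutFor(const std::map<std::string, milliseconds> &overrides,
                               const std::string &cmd)
{
    const auto it = overrides.find(cmd);
    return it != overrides.end() ? it->second : kDefaultTimeout;
}
```

Keeping the overrides in a table rather than scattered through the call sites would make it easy to tune individual commands after testing over WiFi.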

GX98 (is the rotator there), FA (is the first focuser there) and fA (is the second focuser there) all return within 50 ms or so, as they should. And if the code acts on them (i.e. does not send any more :F, :f or :r commands), then all will be good (at least for focuser and rotator detection).
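
"Acting on" the probe results could be as simple as a gate in front of the send path. The sketch below assumes the probes have already been run and their results stored; `Capabilities` and `shouldSend` are hypothetical names, and the prefix rule (:F first focuser, :f second focuser, :r rotator) is taken from the description above rather than from the driver source.

```cpp
#include <string>

// Results of the start-up probes (:GX98#, :FA#, :fA#), filled in elsewhere.
struct Capabilities
{
    bool rotator  = false;
    bool focuser1 = false;
    bool focuser2 = false;
};

// Returns false for commands addressed to a device the probes did not find.
// The probes themselves must bypass this gate, since :FA# and :fA# also
// start with :F and :f.
inline bool shouldSend(const Capabilities &caps, const std::string &cmd)
{
    if (cmd.size() < 2 || cmd[0] != ':')
        return true; // not a recognised command shape; leave it alone
    switch (cmd[1])
    {
        case 'F': return caps.focuser1;
        case 'f': return caps.focuser2;
        case 'r': return caps.rotator;
        default:  return true;
    }
}
```

With no rotator and no focuser detected, every :F, :f and :r command is skipped, which is where the saved startup time would come from.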

My startup is exactly 3 seconds (no rotator, no focuser).

After I typed all this, I did get a crash long after I started INDI successfully. Nothing special being done, the mount is at home and not tracking. So that crash should not stop the GX98 change going in.

I tested again slewing west then east, and all is normal.

Attached is a diff file of Alain's changes against the current master of Jasem's PPA.
2 years 5 months ago #77729

  • Posts: 322
  • Thank you received: 31
Here is another puzzle.
After doing some testing, and then telling the mount to move back home, the crosshairs were wrong.
They were below the horizon towards the east.

Here is what INDI was showing just before it crashed.

i.imgur.com/w8meCjP.png

At the same point, querying the mount from a python script showed the proper Alt and Az:

kbahey@tilapia:~/projects/OnStep/onstep-python$ examples/test-timeouts.py
0.040 GR 11:13:36#
0.070 GD +90*00:00#
0.050 GZ 000*00:00#
0.049 GA +43*25:07#

So something in INDI is messed up.
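
To compare raw replies like the ones above with what a client displays, it helps to convert them to decimal degrees. This is a minimal sketch that assumes the `sDD*MM:SS#` shape seen in the output (with an optional sign); `parseDms` is a hypothetical helper, not a function from INDI or the Python script.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Parses "+43*25:07#" or "000*00:00#" into decimal degrees.
// Returns std::nullopt when the text does not match DD*MM:SS.
inline std::optional<double> parseDms(const std::string &reply)
{
    if (reply.empty())
        return std::nullopt;

    double sign = 1.0;
    std::size_t start = 0;
    if (reply[0] == '+' || reply[0] == '-')
    {
        sign  = (reply[0] == '-') ? -1.0 : 1.0;
        start = 1;
    }

    int deg = 0, min = 0, sec = 0;
    if (std::sscanf(reply.c_str() + start, "%d*%d:%d", &deg, &min, &sec) != 3)
        return std::nullopt;

    return sign * (deg + min / 60.0 + sec / 3600.0);
}
```

For the `GA` reply above this gives an altitude of roughly 43.42 degrees, i.e. well above the horizon, which is what makes the below-horizon crosshairs in the client look wrong.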

Upon restarting KStars and INDI, the RA/DEC were correct and incrementing.

i.imgur.com/Unx0hMD.png

So something else is going on.
2 years 5 months ago #77730

  • Posts: 322
  • Thank you received: 31
That last problem of wrong pointing, and probably the crash, has to do with parking.

I pressed "Purge data" from Site Management -> Park Options, slewed to Arcturus, then Altair, then told it Return Home, and all worked as it should.

So there is a separate issue that is related to parking.
2 years 5 months ago #77732

  • Posts: 322
  • Thank you received: 31
More iterations of testing with "return home" after Park data was purged show everything working normally. No crashes.

So it has something to do with Parking, which is not part of my actual workflow since I don't have an observatory.
2 years 5 months ago #77734

  • Posts: 452
  • Thank you received: 71
Khalid,

I tried the "Purge" method but still have the jumps when Parking / Unparking.

In the meantime I started a list of all the LX200 commands: github.com/azwing/OnStep_Commands
It is only a start, but it is more complete than the Wiki, and I still miss a bunch of commands.
Since I am blocked for hardware reasons ...
2 years 5 months ago #77741

  • Posts: 322
  • Thank you received: 31
After I purged, I never parked. Just "return home", and everything was normal.

So my scenario is different than yours.

There is still a bug that happens after parking (wrong coordinates reported in KStars, but not in other client applications) and crashing.

I think the crashing is related to the wrong coordinates somehow.

The puzzle is what happens after parking that triggers all this. It is not parking itself, but something following or relating to it.
2 years 5 months ago #77743

  • Posts: 148
  • Thank you received: 19
Khalid, can you crash at will? If so, humour me here. As you know, I have a number of systems, and under constant settings and conditions I randomly crash. BUT I have found that if I turn off logging completely, I do not randomly crash. Your crash on park may not be the park process but LOGGING about the park process. Every crash I get is related somehow to Q lib (either io or audio), so my suspicion is that the values used in motion and parking blow up when being logged. If you can replicate your crash, try a few times with no logging turned on. It would almost verify my theory. My crashes are too random to verify it.
2 years 5 months ago #77744

  • Posts: 322
  • Thank you received: 31
Not really 'at will'. It takes some 5 minutes or so after I park, and after the coordinates are different from what OnStep actually reports, then it crashes.

I will park and see if it happens again.
2 years 5 months ago #77745

  • Posts: 322
  • Thank you received: 31
My log verbosity is 'regular'.

The park function works normally initially (except for that jump that Alain reported. I see it too).
There is no crash (at least for a few minutes).

Then I unparked, then slewed to the other side of the meridian. All normal.

Then issued a park command and waited for maybe 10 minutes and it did not happen.

Now that I try to remember, the crash happened the next day from trying to park. Maybe because the position was under the horizon (or so INDI thought it was).

Will leave it like that, and test tomorrow.

Sorry, nothing definitive here.
2 years 5 months ago #77748

  • Posts: 452
  • Thank you received: 71
Khalid,

I am trying to set up different configurations with the hardware I have available (Bluepill / Arduino-Mega / Fysetc S6 / Max PCB2).
Not that I believe the errors come from the different platforms; it is just that this is what I have to set up different scenarios
(with / without focuser, rotator, equatorial vs. alt-az ...).

I could not do them all since I still have no soldering iron, but I was able to set up an Arduino Mega 2560 with flying wires.

Just to see if my thoughts are correct.
In theory:
1) If I connect to the serial port dedicated to WiFi (via a USB-serial dongle), I should be able to connect with INDI or a terminal and send commands / receive responses, correct?
2) If I use an Arduino Mega 2560, I connect my ESP8266 Rx/Tx to Serial 1, correct?

If I try (1), I still sometimes get errors that I don't get via standard USB.
If I try (2), I get some other errors (with KStars or with the Python script):
a) I cannot connect at all
b) I can connect but after a while all is blocked
c) I have the message "Serial Interface to OnStep is Down!" in the Web browser
Last edit: 2 years 4 months ago by Alain Zwingelstein.
2 years 4 months ago #77768
