BUG: scintilla.CurrentPos and scintilla.Caret.Position bugged

Aug 31, 2013 at 8:07 PM
Edited Aug 31, 2013 at 8:07 PM
Hello,

At the moment I am having serious problems with the Scintilla properties scintilla.CurrentPos and scintilla.Caret.Position, neither of which works correctly.

If you just display the current position in an extra textbox while typing, you will notice that it is heavily bugged and can't be used reliably.

If I type one letter and then a space, the position is correct. If I then delete that space with backspace, the displayed position remains the same although the caret has moved.

It gets even worse if there are special chars in the text of the Scintilla control, for example 'ü', 'ä' or 'ö', but also 'ß' or '§' (and many more): the current position gets incremented by two for each of those special chars, although each only counts as one!

This also happens when those chars are within comments. As a result it is easily possible (and very likely) that such a simple thing like:
char[] textArray = scintilla1.Text.ToCharArray();
char actualChar = textArray[scintilla1.CurrentPos];
will cause an out-of-bounds exception (IndexOutOfRangeException in .NET).

Does anyone know how to solve this problem? It is really annoying and makes it basically impossible to use CurrentPos in a reliable way.

Regards,
2mQ
Coordinator
Sep 2, 2013 at 3:23 AM
Everything you are describing can easily be explained....

What event are you using to trigger the update of your cursor position text box? You weren't specific... but you would definitely need to be using one. The CurrentPos and/or Caret.Position properties aren't going to notify you of changes, and as a result, the value displayed there could be stale. The NativeScintilla.UpdateUI event is the correct one to be using. It is for this express purpose. The Scintilla documentation describes the event as "Either the text or styling of the document has changed or the selection range or scroll position has changed. Now would be a good time to update any container UI elements that depend on document or view state." Make sure you're using that event to trigger any updates to your position display.

As for the position value being off when you use Unicode characters such as 'ü', 'ä' or 'ö'... that is because those properties tell you the current byte offset, not the character offset. As you may know, Unicode characters can often require more than one byte of storage. This problem is a limitation of the native Scintilla component and has been well documented in these forums. https://www.google.com/#q=Unicode+site%3Ahttp%3A%2F%2Fscintillanet.codeplex.com%2Fdiscussions%2F
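To make the byte-versus-character distinction concrete, here is a minimal sketch in plain .NET with no ScintillaNET calls. The class and method names are made up for illustration, and it assumes the document is stored as UTF-8, as is the case when Scintilla runs in Unicode mode.

```csharp
using System;
using System.Text;

// Illustrative helper only; these names are hypothetical and not part of ScintillaNET.
static class OffsetDemo
{
    // Number of bytes needed to store the text as UTF-8.
    // This is the unit a byte offset is measured in.
    public static int Utf8ByteLength(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    // UTF-8 byte offset that corresponds to a given index into the .NET string.
    public static int ByteOffsetOfCharIndex(string text, int charIndex)
    {
        return Encoding.UTF8.GetByteCount(text.Substring(0, charIndex));
    }
}
```

For the string "a\u00FCb" (a, ü, b), Length is 3 but Utf8ByteLength returns 4, and ByteOffsetOfCharIndex for index 2 (the 'b') returns 3. A caret sitting just before 'b' would therefore be reported as 3, which is already past the last valid index of the 3-element char array.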

Instead, try the NativeScintilla.GetColumn method. It's probably what you're looking for. Per the Scintilla documentation, "returns the column number of a position pos within the document taking the width of tabs into account. This returns the column number of the last tab on the line before pos, plus the number of characters between the last tab and pos. If there are no tab characters on the line, the return value is the number of characters up to the position on the line. In both cases, double byte characters count as a single character." (emphasis on the last line)

If that's not what you're looking for, then try some of the other suggestions in the forums to translate byte offsets to character offsets.

At this point, hopefully the bells are starting to go off and I won't need to explain why your code sample could throw an exception.


Jacob
Sep 2, 2013 at 4:07 PM
Edited Sep 2, 2013 at 4:13 PM
Unfortunately, those explanations don't help much.

jacobslusser wrote:
Everything you are describing can easily be explained....

What event are you using to trigger the update of your cursor position text box? You weren't specific... but you would definitely need to be using one. The CurrentPos and/or Caret.Position properties aren't going to notify you of changes, and as a result, the value displayed there could be stale.
I wasn't specific about the event because it shouldn't matter. As long as the event fires at an appropriate time (namely after the text change has happened), CurrentPos should be correct. But to be specific: I used the TextChanged and SelectionChanged events as well as the suggested NativeScintilla.UpdateUI event. Both the TextChanged and SelectionChanged events do get fired.

Especially annoying is the fact that if I type a space (which triggers the TextChanged event in any case), it only fails for the first space after a letter. After that it does work, so this can only be a bug.

jacobslusser wrote:
As for the position value being off when you use Unicode characters such as 'ü', 'ä' or 'ö'... that is because those properties tell you the current byte offset, not the character offset. As you may know, Unicode characters can often require more than one byte of storage. This problem is a limitation of the native Scintilla component and has been well documented in these forums. https://www.google.com/#q=Unicode+site%3Ahttp%3A%2F%2Fscintillanet.codeplex.com%2Fdiscussions%2F


At this point, hopefully the bells are starting to go off and I won't need to explain why your code sample could throw an exception.


Jacob
The GetColumn method doesn't do what I am looking for, as it gets reset after every new line. But I need the absolute position of the caret in the text.

I also looked through the posted link and found this workaround (posted by you). But unfortunately it is not reliable either. It has the same problem as above, but here for all kinds of chars: the first (and only the first) char entered after a letter doesn't change the reverse-engineered position. The method also fails for text inserted via copy and paste.

So, what to do? Such a simple thing should be possible somehow, right?

2mQ
Coordinator
Sep 2, 2013 at 11:53 PM
Works for me...
// Update the text box when the caret moves
scintilla.NativeInterface.UpdateUI += (s, e) =>
{
    int byteOffset = scintilla.CurrentPos;

    // TODO Cache this and get creative or this will slow down your application
    Range range = scintilla.GetRange(0, byteOffset);
    int charOffset = range.Text.Length;

    textBox.Text = "Byte: " + byteOffset + "; Character: " + charOffset;
};
Jacob
Sep 4, 2013 at 9:45 PM
Hm,

OK, I had only checked this event and the workaround separately (the workaround only with TextChanged and SelectionChanged). Both together do work. I don't understand how that's possible, but OK.

Thank you very much for your detailed answers and this solution.

Just one more question: what do you mean by "TODO Cache this and get creative or this will slow down your application"? Didn't you already cache the position in the local variable charOffset?

2mQ
Coordinator
Sep 4, 2013 at 11:20 PM
Edited Sep 4, 2013 at 11:26 PM
Glad you got things worked out...

My suggestion to cache the character offset is because the approach I showed you is expensive. Under the hood, ScintillaNET is copying the bytes of the selected range out of the native Scintilla control and into managed memory. From there it converts the bytes from UTF-8 format used by Scintilla into a .NET compatible UTF-16 string format. That's a lot of memory copying, and if you plan on doing that every time the caret changes position I have no doubt you will see the performance hit. My suggestion is to cache it so you don't do that every time the cursor changes position. For example, you might:

• Store the last position (or several) in a byte-to-character mapping table and, assuming the document contents haven't changed, just lookup your pre-calculated value. Or if you get creative, detect what type of document change occurred and only update the rows in your table that need to be re-computed.

• Calculate the character position of the text up to the line above where the caret currently is. Then you would only have to calculate it from the start of the current line to the caret position and add it to your already calculated value.

• Scintilla also provides some more advanced ways of getting direct access to the internal memory buffer. You might look into accessing the bytes directly and doing your own conversion from UTF-8 to speed up the process and eliminate an intermediate copy.

• Change your application so it doesn't have to update the current position in real-time. :)

• etc...
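As a rough sketch of the second and third suggestions, the conversion can be done by counting characters in the UTF-8 bytes rather than building a string first. The class and method names below are hypothetical, and how the byte[] is obtained from the Scintilla control is left out on purpose; the sketch simply assumes you already have the document's UTF-8 bytes.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of ScintillaNET. Assumes utf8 holds the
// document's UTF-8 bytes; obtaining them from the control is not shown.
static class Utf8Offsets
{
    // Counts the .NET chars (UTF-16 code units) encoded by the first
    // byteOffset bytes, without allocating an intermediate string.
    public static int CharOffsetFromByteOffset(byte[] utf8, int byteOffset)
    {
        if (utf8 == null) throw new ArgumentNullException(nameof(utf8));
        if (byteOffset < 0 || byteOffset > utf8.Length)
            throw new ArgumentOutOfRangeException(nameof(byteOffset));
        return Encoding.UTF8.GetCharCount(utf8, 0, byteOffset);
    }

    // Line-based variant: charsBeforeLine is a previously computed (cached)
    // character count for everything before the current line, so only the
    // bytes from the line start to the caret are examined.
    public static int CharOffsetFromLine(
        int charsBeforeLine, byte[] utf8, int lineStartByte, int caretByte)
    {
        if (utf8 == null) throw new ArgumentNullException(nameof(utf8));
        if (lineStartByte < 0 || caretByte < lineStartByte || caretByte > utf8.Length)
            throw new ArgumentOutOfRangeException(nameof(caretByte));
        int charsInLine = Encoding.UTF8.GetCharCount(
            utf8, lineStartByte, caretByte - lineStartByte);
        return charsBeforeLine + charsInLine;
    }
}
```

Both methods return offsets in .NET chars, so a character outside the Basic Multilingual Plane counts as two, which matches how string and char[] indexing behave. They also assume the byte offsets fall on character boundaries, which should hold for caret positions.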


Jacob

P.S. - The UpdateUI event fires frequently and not just because of caret position changes. It will fire every time the window gets painted (by resizing or obscuring it behind another window). It will fire every time the screen is scrolled. It will fire when the document gets restyled (colorized). etc... The first step in optimizing any code in the UpdateUI event is to determine if the caret has even changed positions at all.
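One way to apply that first step is a small cache that only redoes the conversion when the byte offset has actually changed. This is a sketch under assumptions: CaretOffsetCache is a made-up name, and the conversion itself is passed in as a delegate so the class does not depend on any particular ScintillaNET call.

```csharp
using System;

// Hypothetical helper, not part of ScintillaNET.
sealed class CaretOffsetCache
{
    private readonly Func<int, int> _byteToChar; // the expensive conversion
    private int _lastByteOffset = -1;            // -1 means "nothing cached"
    private int _lastCharOffset;

    public CaretOffsetCache(Func<int, int> byteToChar)
    {
        _byteToChar = byteToChar ?? throw new ArgumentNullException(nameof(byteToChar));
    }

    // How many times the expensive conversion actually ran.
    public int ConversionCount { get; private set; }

    // Call this whenever the document text changes: the same byte offset
    // can then map to a different character offset.
    public void Invalidate()
    {
        _lastByteOffset = -1;
    }

    // Returns the cached character offset if the caret has not moved,
    // otherwise runs the conversion and remembers the result.
    public int GetCharOffset(int byteOffset)
    {
        if (byteOffset != _lastByteOffset)
        {
            _lastCharOffset = _byteToChar(byteOffset);
            _lastByteOffset = byteOffset;
            ConversionCount++;
        }
        return _lastCharOffset;
    }
}
```

In the UpdateUI handler you would pass scintilla.CurrentPos to GetCharOffset, and call Invalidate from a text-change handler. Repaints and scrolling then cost only an integer comparison.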