Problem with German umlauts [solved]

Topics: Developer Forum, User Forum
Nov 20, 2009 at 12:52 PM
Edited Nov 23, 2009 at 1:47 PM


I came across a strange problem, and I'm not sure whether it really is a bug.

If you put German umlauts into Scintilla, these characters seem to be represented by 2 characters each.

E.g. if my text in Scintilla is "abö", then the property scintilla.Caret.Position and scintilla.CurrentPos both return 4.

Is there a way to get the number of characters that are actually in the text, or at least a method to determine how many of them are represented internally by more than 1 number (e.g. ö => byte[] { 195, 182 })?

Thanks in advance



After a little bit of testing, I found out that by changing the code page to ISO 8859-1 ( SetCodePage(28591) ), I get the umlauts to be 1 byte each.

The problem I have now is that Scintilla does not contain the umlauts; it contains weird characters instead.
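
One plausible explanation for the weird characters, sketched below, is that text stored as UTF-8 is being read back as ISO 8859-1, so each of the two bytes of ö becomes its own character. This is only an illustration with the plain .NET Encoding classes; the class and method names (MojibakeDemo, MisreadAsLatin1) are made up for the example, and it does not touch Scintilla itself.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Hypothetical helper: encode text as UTF-8, then decode those same
    // bytes as ISO 8859-1 (code page 28591), as a mismatched code page would.
    public static string MisreadAsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        Encoding latin1 = Encoding.GetEncoding(28591);
        return latin1.GetString(utf8Bytes);
    }

    static void Main()
    {
        // "\u00F6" is ö; its UTF-8 bytes are { 195, 182 }.
        string misread = MisreadAsLatin1("\u00F6");

        // Each byte is decoded separately: 195 -> Ã (U+00C3), 182 -> ¶ (U+00B6).
        Console.WriteLine(misread.Length);            // 2
        Console.WriteLine(misread == "\u00C3\u00B6"); // True
    }
}
```

If that is what is happening, the bytes in the document are unchanged and only their interpretation differs, which would explain why switching the code page alone does not give clean umlauts.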

Nov 23, 2009 at 2:15 PM
Edited Nov 24, 2009 at 8:03 AM

So I just found a solution.

I set the Encoding of Scintilla to windows-1252 (ANSI).

The only thing I don't understand now is why the Ch (character) property of the EventArgs in the scintilla_charadded event gives me a ö

but when I look at the Text property of Scintilla there is "ö"


OK, now I have another problem: is there a way to set the encoding of the autocomplete list?

Nov 24, 2009 at 10:50 PM

ö is a character that can be represented as 1 byte in an ANSI code page, but in UTF-8 it requires 2 bytes since its code point is > 127 and < 256. To determine how many bytes are required for a given character/string, use Encoding.GetByteCount().

int utf8Bytes = Encoding.UTF8.GetByteCount("ö"); // yields 2
// Encoding.ASCII would also yield 1 here, but only because it replaces ö with '?'
int ansiBytes = Encoding.GetEncoding(1252).GetByteCount("ö"); // yields 1 (windows-1252)

For autocomplete, the document encoding should apply universally, but give me an example of something that doesn't work and I'll take a look.
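
To go from a byte position like the 4 reported for "abö" back to a character count, something along these lines might work. It is a minimal sketch that assumes the document is UTF-8 and that the byte offset falls on a character boundary; ByteOffsetToCharIndex is a made-up helper name, not part of ScintillaNET.

```csharp
using System;
using System.Text;

static class Utf8Positions
{
    // Hypothetical helper: number of .NET chars covered by the first
    // byteOffset bytes of the UTF-8 encoding of text.
    // Assumes byteOffset does not fall in the middle of a multi-byte character.
    public static int ByteOffsetToCharIndex(string text, int byteOffset)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        if (byteOffset < 0 || byteOffset > bytes.Length)
            throw new ArgumentOutOfRangeException("byteOffset");
        return Encoding.UTF8.GetCharCount(bytes, 0, byteOffset);
    }

    static void Main()
    {
        string text = "ab\u00F6"; // "abö"
        Console.WriteLine(Encoding.UTF8.GetByteCount(text)); // 4 (1 + 1 + 2 bytes)
        Console.WriteLine(ByteOffsetToCharIndex(text, 4));   // 3 characters
    }
}
```

Re-encoding the whole text on every call is wasteful for large documents, so this is meant to show the relationship between the two counts rather than to be used as-is.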

Nov 26, 2009 at 2:20 PM

Hey Chris,

Thank you for this explanation. I solved my problem by using the GetWordFromPosition method instead of counting back from the current position to the first whitespace.
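
For comparison, the counting-back approach only behaves if it walks over characters of the string rather than byte positions. Roughly like this sketch, where WordBefore is a made-up helper operating on a plain .NET string; it is not how GetWordFromPosition is implemented.

```csharp
using System;

static class WordScan
{
    // Hypothetical helper: returns the run of non-whitespace characters
    // that ends at charIndex (a character index, not a byte position).
    public static string WordBefore(string text, int charIndex)
    {
        int start = charIndex;
        while (start > 0 && !char.IsWhiteSpace(text[start - 1]))
        {
            start--;
        }
        return text.Substring(start, charIndex - start);
    }

    static void Main()
    {
        string text = "sch\u00F6n gr\u00FC\u00DF"; // "schön grüß", 10 characters
        string word = WordBefore(text, text.Length);
        Console.WriteLine(word == "gr\u00FC\u00DF"); // True
        Console.WriteLine(word.Length);              // 4
    }
}
```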

Best regards