C++, New Lexer

Topics: User Forum
Apr 15, 2012 at 6:01 PM
Edited Apr 15, 2012 at 8:55 PM

Hi guys.

I'm developing a source code editor for an entirely different language, so I set out to write a lexer in C#. I looked at the IniLexer sample and wrote my own lexer based on it (with some optimizations posted by other users in the IniLexer topic).

But when you copy in a large amount of text, it is a bit slow to process all of it (I tested with a built-in lexer and it was very fast), so I think it would be better to write the lexer in C/C++. Can anyone please explain how to implement a lexer for ScintillaNET in C/C++, and how the DLL should be structured?

 

Another question: a character like 'Ã' is treated as two characters. For example:

_scintilla.Text = "Ã";
int len = _scintilla.TextLength; // TextLength returned 2!

How to fix?

Thanks in advance, and sorry for my bad English.

Coordinator
Apr 20, 2012 at 3:50 PM

Can anyone please explain how to implement a lexer for ScintillaNET in C/C++, and how the DLL should be structured?

You're pretty much on your own if you want to implement a native Scintilla lexer. We don't currently have support for that scenario in ScintillaNET. Information about creating native lexers in C/C++ can be found on the Scintilla website: scintilla.org.

I'm pretty confident, however, that C#/.NET is capable of meeting all your lexing needs and you may just need to be more particular about how much text you are processing at a time.
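
One way to limit how much text is processed at a time is to re-lex only the lines touched by a change rather than the whole document. The sketch below is a minimal illustration of that idea, not ScintillaNET or Scintilla API: `ExpandToLines` is a hypothetical helper that widens a changed byte range to whole lines, on the assumption that the lexer is line-oriented (as an INI-style lexer typically is). It is written in C++ only because that is what the question asked about; the same logic carries over to C# directly.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not part of Scintilla or ScintillaNET).
// Expands a changed byte range [start, end) to whole lines so that a
// line-oriented lexer only has to re-process the lines that were touched.
// Returns the expanded [start, end) range.
std::pair<std::size_t, std::size_t> ExpandToLines(const std::string& text,
                                                  std::size_t start,
                                                  std::size_t end) {
    // Clamp the input so out-of-range values cannot index past the buffer.
    if (start > text.size()) start = text.size();
    if (end > text.size()) end = text.size();
    if (end < start) end = start;

    // Walk back to the first byte after the previous newline (or to 0).
    while (start > 0 && text[start - 1] != '\n') --start;

    // Walk forward to the next newline (or to the end of the text).
    while (end < text.size() && text[end] != '\n') ++end;
    if (end < text.size()) ++end;  // include the newline itself

    return {start, end};
}
```

With the text "a=1\nb=2\nc=3\n", a one-byte change inside the second line (range 5 to 6) expands to the range 4 to 8, so only "b=2\n" needs to be lexed again instead of all three lines.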

 

Another question: a character like 'Ã' is treated as two characters. For example:

_scintilla.Text = "Ã";
int len = _scintilla.TextLength; // TextLength returned 2!

How to fix?

This is a long-running issue with most newbies in our forums. If you poke around you'll find plenty of discussions on this topic. In short, it is because the TextLength property (and many other properties in ScintillaNET) represents the number of bytes, not chars. A non-ASCII character like 'Ã' takes two bytes in a multi-byte encoding such as UTF-8, and hence the TextLength property returns 2.
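
To make the byte versus character distinction concrete, here is a small standalone sketch. It assumes the text is valid UTF-8, and `CountCodePoints` is a hypothetical helper written for this illustration, not a Scintilla or ScintillaNET function. It counts characters (code points) by skipping UTF-8 continuation bytes, which always have the bit pattern 10xxxxxx.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only.
// Counts Unicode code points in a UTF-8 encoded string by counting every
// byte that is NOT a continuation byte (continuation bytes are 10xxxxxx).
// Assumes the input is valid UTF-8.
std::size_t CountCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// 'Ã' is U+00C3, which UTF-8 encodes as the two bytes 0xC3 0x83.
// Writing the bytes explicitly keeps the example independent of the
// source file's encoding.
const std::string kATilde = "\xC3\x83";
```

Here `kATilde.size()` is 2 (the byte length, which is the kind of number TextLength reports), while `CountCodePoints(kATilde)` is 1 (the number of characters the user sees).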