Custom Lexer glitch

Topics: Developer Forum, User Forum
Mar 9, 2013 at 6:04 AM
Hi, I recently started working with ScintillaNET and found that using a custom lexer was heaps easier than using an existing one, so I downloaded the one that Jacob put up in change set 45206.
I made some modifications to structure it more in my style, but for the most part it is the same (if you want, I'll post my code).
So here's the issue: I get a weird glitch when lexing, and I think it has to do with communication with the actual Scintilla object.
I start typing and everything is fine, but at about 10 characters I start losing the formatting and styling I should have. With strings (marked with ') it works perfectly on the first one, but if I then type normal text followed by another string, the string formatting seems to be applied to the normal text instead of the second string. I have no idea why.
Mar 9, 2013 at 6:20 AM
More info: it seems to be styling only 1/2 + 1 of the characters in each line. I'm still clueless. I think it MAY have something to do with the StartStyling() function (passing in 0 and FF).
Mar 9, 2013 at 6:35 AM
Seems to be OK now; I had some duplicate index increments. BUT NOW I'VE LOST MY COLOUR ;( It hates me.
Mar 9, 2013 at 7:22 AM
Edited Mar 9, 2013 at 7:25 AM
Once again, more info on this issue: colour only seems to be applied for the Default style.
And I'm getting impatient with this glitchy software.
I tried different colours on all styles and different style indexes, to no avail.
Mar 9, 2013 at 4:00 PM
Post your code and we might be able to offer more assistance.

Jacob
Mar 9, 2013 at 5:32 PM
Lexers are always a fun subject, and it sounds to me like you're running into the exact same issues that I ran into (and overcame) when putting the custom lexer framework into the WPF branch, so I'd suggest looking at that branch. The structure of the custom lexers I implemented is a bit different, though: they are all state-machine based, which is closer to how the native lexers are written. Also, custom lexers in that branch are used in exactly the same way as native lexers, and a custom lexer will override a native lexer if you define it with the same name. (This is how I was able to implement the SQL and VB lexers as custom lexers.)

If, however, you don't wish to use that branch, I can provide some help. Firstly, the StartStyling function takes two parameters: the first is the position in the document where you want to start styling, and the second is the style mask. If the index of the style you're setting isn't covered by that mask, that style won't be applied. Secondly, when styling, make sure you style the normal characters with the default style (for consistency with the native lexers, you should be using index 0 as the default style, not index 32 like the IniLexer does). This is needed so that the style pointer (the position in the document where the next style will be applied) stays at the same point that you are at in your custom lexer.
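
To make the second point concrete, here is a minimal sketch of why skipping "normal" characters shifts every later style. It is a toy stand-in, not the ScintillaNET API: the `StyleBufferSketch` class and its `Run` helper are hypothetical, and the buffer only imitates the "style N characters, then advance the pointer" behaviour described above.

```csharp
using System;

// Toy model of a styling buffer; NOT part of ScintillaNET (hypothetical names).
public sealed class StyleBufferSketch
{
    private readonly byte[] _styles; // one style byte per character
    private int _pos;                // the "style pointer"
    private int _mask;

    public StyleBufferSketch(int length) { _styles = new byte[length]; }

    // Imitates StartStyling: move the pointer and remember the mask.
    public void StartStyling(int position, int mask) { _pos = position; _mask = mask; }

    // Imitates SetStyling: style `length` characters and advance the pointer.
    public void SetStyling(int length, int style)
    {
        for (int i = 0; i < length && _pos < _styles.Length; i++)
            _styles[_pos++] = (byte)(style & _mask);
    }

    public string Dump() { return string.Join(",", _styles); }

    // Styles the 5-character text  ab'c'  : "ab" is plain, "'c'" is a string (style 11).
    // When styleDefaultText is false the lexer skips "ab" without styling it.
    public static string Run(bool styleDefaultText)
    {
        var buffer = new StyleBufferSketch(5);
        buffer.StartStyling(0, 0xFF);
        if (styleDefaultText)
            buffer.SetStyling(2, 0);   // default style keeps the pointer in sync
        buffer.SetStyling(3, 11);      // string style
        return buffer.Dump();
    }
}
```

`StyleBufferSketch.Run(true)` yields `0,0,11,11,11`, while `StyleBufferSketch.Run(false)` yields `11,11,11,0,0`: the string style lands on the plain text because the pointer was never advanced past it.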
Mar 9, 2013 at 10:58 PM
using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;
using ScintillaNET;
using System.Drawing;

namespace ScintillaTest
{
class CtlLexer
{
    private const int EndOfLine = -1;
    enum eSTYLES
    {
        DEFAULT = 32,

        STRING = 11,
        NUMBER = 12,
        COMMENT = 14,
        OPERATOR = 15
    }

    private Scintilla _boundControl;
    /// <summary>
    /// The Scintilla control that is having its content lexed by this lexer.
    /// </summary>
    public Scintilla BoundControl
    {
        get
        {
            return _boundControl;
        }
        set
        {
            _boundControl = value;
            InitialiseLexing();
        }
    }

    private int _lineIndex;
    private string _currentLine;

    /// <summary>
    /// Constructor initializing a default CtlLexer object.
    /// </summary>
    public CtlLexer()
    { }

    #region UtilityFunctions

    private void SetStyleValue(eSTYLES index, Color? colour = null, int? size = null,
        Color? background = null, Font font = null, bool? bold = null,
        bool? italic = null, bool? underline = null, bool? visible = null)
    {
        if (colour != null)
            BoundControl.Styles[(int)index].ForeColor = (Color)colour;
        if (bold != null)
            BoundControl.Styles[(int)index].Bold = (bool)bold;
        if (italic != null)
            BoundControl.Styles[(int)index].Italic = (bool)italic;
        if (underline != null)
            BoundControl.Styles[(int)index].Underline = (bool)underline;
        if (background != null)
            BoundControl.Styles[(int)index].BackColor = (Color)background;
        if (font != null)
            BoundControl.Styles[(int)index].Font = font;
        if (size != null)
            BoundControl.Styles[(int)index].Size = (int)size;
        if (visible != null)
            BoundControl.Styles[(int)index].IsVisible = (bool)visible;
    }

    /// <summary>
    /// Enables custom lexing on the currently bound Scintilla control.
    /// </summary>
    public void InitialiseLexing()
    {
        if (BoundControl != null)
        {
            BoundControl.Margins[0].Width = 30;
            BoundControl.Indentation.IndentWidth = 4;
            BoundControl.Indentation.SmartIndentType = SmartIndent.CPP;

            BoundControl.ConfigurationManager.Language = "";
            BoundControl.Lexing.LexerName = "container";
            BoundControl.Lexing.Lexer = Lexer.Container;

            InitialiseStyles();
        }
    }

    /// <summary>
    /// Reads a character from the current line of the Scintilla text.
    /// </summary>
    /// <returns>An integer representing the character (or the End of Line value).</returns>
    private int Read()
    {
        if (_lineIndex < _currentLine.Length)
            return _currentLine[_lineIndex];
        return EndOfLine;
    }

    /// <summary>
    /// This will style the length of chars and advance the style pointer.
    /// </summary>
    /// <param name="style">The style to use for the styling.</param>
    /// <param name="length">The length of characters to style.</param>
    private void SetStyle(eSTYLES style, int length)
    {
        if (length > 0)
        {
            ((INativeScintilla)BoundControl).SetStyling(length, (int)style);
            _lineIndex += length;
        }
    }

    /// <summary>
    /// Styles just 1 character and advances the index.
    /// </summary>
    /// <param name="style"></param>
    private void StyleChar(eSTYLES style)
    {
        SetStyle(style, 1);
    }

    /// <summary>
    /// Styles text with a given style until a delimiter is hit.
    /// </summary>
    /// <param name="style">The style to use on the text.</param>
    /// <param name="delimiters">The array of delimiters that stop the styling.</param>
    private void StyleUntilMatch(eSTYLES style, char[] delimiters)
    {
        int finalIndex = _lineIndex;
        finalIndex++;

        while (finalIndex < _currentLine.Length)
        {
            if (Array.IndexOf<char>(delimiters, _currentLine[finalIndex]) >= 0)
            {
                break;
            }
            finalIndex++;
        }

        if (finalIndex != _lineIndex)
            SetStyle(style, finalIndex - _lineIndex);
    }

    /// <summary>
    /// Advances the index to the first non-whitespace character.
    /// </summary>
    private void StyleWhitespace()
    {
        // Advance the index until non-whitespace character
        int startIndex = _lineIndex;

        while (_lineIndex < _currentLine.Length && Char.IsWhiteSpace(_currentLine[_lineIndex]))
            _lineIndex++;

        SetStyle(eSTYLES.DEFAULT, _lineIndex - startIndex);
    }
    #endregion UtilityFunctions

    /// <summary>
    /// Sets the formatting for the styles of the currently bound Scintilla control.
    /// </summary>
    public void InitialiseStyles()
    {
        SetStyleValue(eSTYLES.DEFAULT, Color.RoyalBlue);
        SetStyleValue(eSTYLES.COMMENT, Color.Green);
        SetStyleValue(eSTYLES.STRING, Color.Red);
        SetStyleValue(eSTYLES.NUMBER, Color.Red);
        SetStyleValue(eSTYLES.OPERATOR, Color.Black, null, null, null, true);
    }

    /// <summary>
    /// Runs this lexer on the bound control and styles its content.
    /// </summary>
    public void RunLexer()
    {

        if (BoundControl == null)
            return;

        // Start styling from 0 (can change to other if you feel sure of where
        //   changes have occured).
        ((INativeScintilla)BoundControl).StartStyling(0, 0xfa);

        string[] lines = BoundControl.Text.Split(new string[] {System.Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length == 0)
            return;

        for (int i = 0; i < lines.Length; i++)
        {
            _currentLine = lines[i];

            for (_lineIndex = 0; _lineIndex < _currentLine.Length; )
            {
                if (i > 0)
                    SetStyle(eSTYLES.DEFAULT, 2);

                switch (Read())
                {
                    case '\'':
                        StyleUntilMatch(eSTYLES.STRING, new char[] { '\'' });
                        StyleChar(eSTYLES.STRING);
                        break;
                    case '=':
                    case '>':
                    case '<':
                    case '!':
                    case '(':
                    case ')':
                    case '[':
                    case ']':
                    case ';':
                        StyleChar(eSTYLES.OPERATOR);
                        break;
                    case '#':
                        SetStyle(eSTYLES.COMMENT, _currentLine.Length - _lineIndex);
                        break;
                    case EndOfLine:
                        break;
                    default:
                        StyleChar(eSTYLES.DEFAULT);
                        break;
                }
            }
        }
    }
}
}

Now I've got formatting working on the default style, but where it should format a string or comment it applies no formatting at all (not even the default). I'm also looking for a way to force it to re-style the ENTIRE document (even though I re-apply the styles on every change, they don't seem to be re-applied to the document for some reason).
Thanks for your help :)
Mar 9, 2013 at 11:18 PM
Edited Mar 9, 2013 at 11:35 PM
Well, although I really don't like the way you're doing it, it is how the IniLexer was written, so I can only complain so much, especially if you haven't written a lexer before. The biggest reason I don't like it is because it's slow, and will start causing issues as the size of the document you're working with grows.

As to why the styles aren't applying, it's because you are calling StartStyling with 0xFA as the mask, meaning when you go to set a style, for example a comment, which is 14, it's setting the style to 10, because of the mask.
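
The arithmetic can be checked with a small sketch. This is plain bitwise math, not a ScintillaNET call; the `StyleMaskSketch` class and `Apply` method are hypothetical names, and the style numbers are the ones from the eSTYLES enum posted above.

```csharp
using System;

// Hypothetical helper: shows which style index survives a given style mask.
public static class StyleMaskSketch
{
    // A style index is ANDed with the mask before it is stored.
    public static int Apply(int style, int mask) { return style & mask; }

    // Builds one line per style, e.g. "14 -> 10", for the given mask.
    public static string Describe(int mask)
    {
        int[] styles = { 32, 11, 12, 14, 15 }; // DEFAULT, STRING, NUMBER, COMMENT, OPERATOR
        var lines = new string[styles.Length];
        for (int i = 0; i < styles.Length; i++)
            lines[i] = styles[i] + " -> " + Apply(styles[i], mask);
        return string.Join("; ", lines);
    }
}
```

`StyleMaskSketch.Describe(0xFA)` gives `32 -> 32; 11 -> 10; 12 -> 8; 14 -> 10; 15 -> 10`, so only the default style (32) keeps its index, which matches the "colour only on Default" symptom. With `0xFF` every index comes through unchanged.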

Give me a few minutes and I'll edit this post to include a lexer written for the WPF branch (the WPF support is simply a wrapper around the WinForms version) that does what yours currently does.

Edit:
And here it is:
using System;
using System.Drawing;
using System.Collections.Generic;

namespace ScintillaNET.Lexers
{
    public sealed class CtlLexer : CustomLexer
    {
        public override string LexerName { get { return "ctl"; } }


        private const int STYLE_STRING = 11;
        private const int STYLE_NUMBER = 12;
        private const int STYLE_COMMENT = 14;
        private const int STYLE_OPERATOR = 15;

        public CtlLexer(Scintilla scintilla) : base(scintilla) { }

        private enum State : int
        {
            Unknown = STATE_UNKNOWN,
            String,
            Comment,
        }

        new private State CurrentState
        {
            get { return (State)base.CurrentState; }
            set { base.CurrentState = (int)value; }
        }

        protected override void InitializeStateFromStyle(int style)
        {
            switch (style)
            {
                case STYLE_STRING:
                    CurrentState = State.String;
                    break;
                // Otherwise we don't need to carry the
                // state on from the previous line.
                default:
                    break;
            }
        }

        protected override void Style()
        {
            StartStyling();

            while (!EndOfText)
            {
                switch (CurrentState)
                {
                    case State.Unknown:
                        bool consumed = false;
                        switch (CurrentCharacter)
                        {

                            case '\'':
                                CurrentState = State.String;
                                break;
                            case '=':
                            case '>':
                            case '<':
                            case '!':
                            case '(':
                            case ')':
                            case '[':
                            case ']':
                            case ';':
                                Consume();
                                SetStyle(STYLE_OPERATOR);
                                consumed = true;
                                break;
                            case '#':
                                CurrentState = State.Comment;
                                break;
                            default:
                                if (IsWhitespace(CurrentCharacter))
                                {
                                    ConsumeWhitespace();
                                    consumed = true;
                                }
                                else
                                {
                                    SetStyle(STYLE_DEFAULT);
                                }
                                break;
                        }
                        if (!consumed)
                            Consume();
                        break;
                    case State.Comment:
                        ConsumeUntilEOL(STYLE_COMMENT);
                        CurrentState = State.Unknown;
                        break;
                    case State.String:
                        // We're using '\0' as the escape character because a NUL character
                        // doesn't usually appear in text-based documents.
                        ConsumeString(STYLE_STRING, '\0', false, false, '\'');
                        consumed = true;
                        CurrentState = State.Unknown;
                        break;
                    default:
                        throw new Exception("Unknown state!");
                }
            }

            switch (CurrentState)
            {
                case State.Unknown: break;
                case State.Comment:
                    SetStyle(STYLE_COMMENT);
                    break;
                case State.String:
                    SetStyle(STYLE_STRING);
                    StyleNextLine(); // Continue the style to the next line
                    break;
                default:
                    throw new Exception("Unknown state!");
            }
        }
    }
}
You would then add an XML config file for the language and set the document's language to ctl.
Mar 9, 2013 at 11:21 PM
I don't see anywhere in your code where you're handling the StyleNeeded event. When you set the Lexer to Container, Scintilla will raise the StyleNeeded event when it needs you to style the text. It tells you when to style, not the other way around.

Not handling this event may be what's causing your problem.

Jacob
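
As a rough illustration of letting Scintilla drive the styling: when a StyleNeeded handler runs, a container lexer does not have to restyle from position 0; it can back up to the start of the line that contains the first position needing styling and lex from there. The helper below is a hypothetical, self-contained sketch (`StyleRangeSketch` and `LineStart` are not part of ScintillaNET), and it assumes single-byte characters so that string indexes and document positions line up.

```csharp
using System;

// Hypothetical helper a StyleNeeded handler could use to pick a restart point.
public static class StyleRangeSketch
{
    // Returns the index of the first character of the line containing `position`.
    // A newline character is treated as belonging to the line it terminates.
    public static int LineStart(string text, int position)
    {
        if (position > text.Length)
            position = text.Length;   // clamp to the end of the text
        if (position <= 0)
            return 0;                 // already at the start (or empty text)

        // Search backwards for the previous newline, starting just before `position`.
        int newline = text.LastIndexOf('\n', position - 1);
        return newline + 1;           // -1 (not found) becomes 0
    }
}
```

For the text `"ab\ncd"`, `LineStart(text, 4)` returns 3 (the start of the second line) and `LineStart(text, 1)` returns 0.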
Mar 10, 2013 at 12:21 AM
Thanks to both of you :) I will compare my code with yours, Blah, and see how it works. I want to make it work before I optimise, but I'm only going up to 32k characters, so it should be all right.

I call this on the TextChanged event of the control, but if StyleNeeded is preferable I will change it.
Mar 10, 2013 at 12:28 AM
Edited Mar 10, 2013 at 12:30 AM
Blah: Your code seems to work totally differently and it's not plug and play (probably something I need to set up), so I don't have any frame of reference for integrating or using it. I appreciate the effort, but filling in the gaps would be nice (speak to me like a 3 year old :) ).

If I wanted to continue using the current solution, what would I set the flag to? I don't know what the flag does since it's not documented anywhere :(
Mar 10, 2013 at 12:36 AM
OH YEAH, I set the flag to 0xff and it all seems to work now xD Love you for your help :)

But it would be nice to have detailed, consolidated documentation. To be honest, it's the only thing your software is missing that is stopping it from being, hands down, the best embedding tool in existence.

Thanks again :)
Dec 9, 2013 at 4:45 PM
Hello,
I'm looking at ways to syntax highlight SQL-like text.
My files are SQL templates with some special tags that I want to highlight.
For example, I have param placeholders of the form {{param_name}} that I want to colour pink.
In Notepad++ I've defined a "custom language" using the following workaround:
in a keyword list I've inserted "{{" and set prefix mode,
so for example {{test}} and {{my_param}} will be coloured with the group style.
Is it possible to do the same in ScintillaNET, or is "prefix mode" specific to Notepad++?

If I understand correctly, I can't start from an existing language like mssql and add this behaviour;
I have to implement a CustomLexer instead.

Is there a complete example project with a basic CustomLexer that I can start from?

thanks
Mar 26, 2014 at 2:40 AM
The demonstration program SCide includes a basic INI custom lexer you can learn from. The implementations provided by blah38621 in the WPF branch (and in the thread above) are much more capable, however.

You might also be able to avoid the custom lexer thing if your language is truly like SQL with the exception of the {{param_name}} sequence by using the SQL lexer and then some combination of FindReplace.FindAll and FindReplace.HighlightAll (or Range.SetIndicator) to search for the {{param_name}} sequence and highlight it. It's quick and dirty but it might work.


Jacob
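
For the quick-and-dirty route, the search half could be sketched as below. This is only the "find the spans" step, written against the standard .NET regex classes; the `PlaceholderFinderSketch` name is hypothetical, the `{{name}}` syntax (letters, digits, underscores) is an assumption based on the examples in the question, and feeding the resulting spans to FindReplace or Range.SetIndicator is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: locates {{param_name}} placeholders in SQL-like text.
public static class PlaceholderFinderSketch
{
    // Assumed placeholder syntax: two braces, a word, two braces.
    private static readonly Regex Placeholder = new Regex(@"\{\{\w+\}\}");

    // Returns (start index, length) pairs, in document order.
    public static List<KeyValuePair<int, int>> Find(string text)
    {
        var spans = new List<KeyValuePair<int, int>>();
        foreach (Match match in Placeholder.Matches(text))
            spans.Add(new KeyValuePair<int, int>(match.Index, match.Length));
        return spans;
    }
}
```

For `"SELECT {{a}} FROM {{my_t}}"` this returns the spans (7, 5) and (18, 8), which are the ranges one would then highlight over the SQL lexer's normal styling.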