FixBrowser
FixBrowser: Inside the rich text component (+web demo) (2025/02/10)
This blog post describes the rich text component that the FixBrowser web browser uses to display HTML pages and to interact with the user.
About FixBrowser
FixBrowser is a lightweight web browser created from scratch. It is designed from the ground up with privacy in mind. This is achieved with a whitelist approach, where only the resources strictly needed by the website are loaded, and by having JavaScript disabled (wait! read the next paragraph before dismissing it).
Many websites are however not usable with JavaScript disabled, therefore FixBrowser contains an updated set of fix scripts that fix and even improve various websites as well as groups of websites (using the same common technology such as WordPress, Disqus forums, etc.).
FixBrowser is not yet ready for practical use. However, you can use the FixProxy tool, which lets you use the "backend" part of the browser (everything other than the actual rendering and layout) with a regular web browser. I've been using this proxy as my primary way of web browsing for multiple years with good results.
The approach
The complexity of implementing a web browser from scratch was greatly reduced by not supporting JavaScript. This creates a cascade effect of simplification on every component and the overall architecture.
In a regular browser the complexity is very high because JavaScript can trigger any kind of update at any time. This requires maintaining complex data structures that store the relationships between the HTML, CSS and view models:
- HTML model - the "physical" layout of the web page
- CSS model - applied styling using the cascade rules based on the HTML element hierarchy (this already adds quite a bit of complexity)
- view model - represents the blocks containing "paragraphs" (can be simply divs) and the inline elements (such as text) in them; it must handle float elements and layers (this one has quite a different structure from the other models, and simple changes to CSS properties can trigger complex changes in the view model)
The view model maintains its own hierarchy by having a stacking context. It is hierarchical to allow things like having position: relative act as an "anchor" that nested positioned elements are relative to.
When JavaScript is not supported at all, it becomes possible to decouple all the models from each other and follow the UNIX philosophy of "do one thing and do it well". Instead of having everything intermixed, you can process each model separately. This means the processing can go in a straight line: HTML → CSS → view model.
There is no need to maintain complex data structures that allow HTML or CSS to be changed dynamically. This saves both CPU time and memory and is much simpler to implement. For example, applying the CSS cascade rules can be achieved by a simple recursion. The layers can be reduced to a simple linear list instead of being hierarchical. After the processing is done, only the data structures of the view model remain.
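To illustrate the "simple recursion" idea, here is a minimal sketch in Python (not FixScript). The Node type, the INHERITED set and the apply_cascade function are hypothetical names made up for this example, and selector matching is left out entirely; it only shows how a one-pass walk can compute styles when nothing can change the tree afterwards.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified node type; not FixBrowser's actual data structure.
@dataclass
class Node:
    tag: str
    style: dict = field(default_factory=dict)     # properties declared on the node
    children: list = field(default_factory=list)

# Assumption for this sketch: only these properties are inherited.
INHERITED = {"color", "font-size"}

def apply_cascade(node, parent_computed=None):
    """Return (tag, computed_style, children) tuples built in a single pass."""
    parent_computed = parent_computed or {}
    # Start from the inherited subset of the parent's computed style,
    # then let the node's own declarations override it.
    computed = {k: v for k, v in parent_computed.items() if k in INHERITED}
    computed.update(node.style)
    return (node.tag, computed,
            [apply_cascade(child, computed) for child in node.children])

tree = Node("div", {"color": "red", "margin": "4px"}, [
    Node("p", {}, [Node("b", {"color": "blue"})]),
])
result = apply_cascade(tree)
```

Because the input tree is never modified after this pass, there is no need for back-references from computed styles to the nodes that produced them.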
Rich text component
The rich text component is designed as a toolkit for implementing your own representations. It can be used for a simple edit box, for complex rendering of web pages, or for anything in between. More complex representations (such as tables) are achieved by nesting the rich text layers.
The core of the component is a Layer. It contains zero or more blocks (e.g. paragraphs), where each block contains inline text, inline objects or floating objects. Both blocks and inline objects have their style information attached.
The inline objects are laid out into separate rows. Text is split if it is longer than a single row. Each row has the height of the tallest object present on the row, and text and objects are aligned to a common baseline.
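The row rule can be sketched as follows. This is an illustrative Python example, not the component's code: the Inline type and layout_row function are hypothetical, and it assumes each inline item reports an ascent (above the baseline) and a descent (below it). With baseline alignment the row height is the largest ascent plus the largest descent, which equals the height of the tallest object whenever that object also reaches deepest below the baseline.

```python
from dataclasses import dataclass

@dataclass
class Inline:
    ascent: int    # pixels above the baseline
    descent: int   # pixels below the baseline

def layout_row(items):
    """Return (row_height, baseline_y, top_y of each item) for one row."""
    max_ascent = max(item.ascent for item in items)
    max_descent = max(item.descent for item in items)
    baseline = max_ascent                     # measured from the top of the row
    tops = [baseline - item.ascent for item in items]
    return max_ascent + max_descent, baseline, tops

row = [Inline(10, 3), Inline(24, 0), Inline(8, 6)]
height, baseline, tops = layout_row(row)
```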
Block
The model is available in the browser/richtext/model.fix file. It actually consists of just a single class, Block, that contains a packed array of structures describing flows. There are two kinds of flows: text and objects. Text is handled specially as it can be word wrapped. Each flow contains an opaque reference to a style; this is not interpreted by the rich text component in any way, just passed along for your purposes. The Block also has its own style attached to it.
The provided methods add, set and remove allow you to manipulate the flows, and the get_flow_* methods retrieve information about the stored flows.
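As a rough picture of what a "packed array of structures" means, here is a Python analogue. Everything in it is an assumption made for illustration: the field layout (kind, payload, style), the method signatures and the specific get_flow_* names are invented and are not taken from model.fix.

```python
# Hypothetical Python analogue of a Block; not the contents of model.fix.
FLOW_TEXT, FLOW_OBJECT = 0, 1
FLOW_SIZE = 3  # slots per flow: kind, payload, style

class Block:
    def __init__(self, style=None):
        self.style = style   # opaque block style, never interpreted here
        self._flows = []     # packed: kind, payload, style, kind, payload, ...

    def add(self, kind, payload, style):
        self._flows += [kind, payload, style]

    def set(self, index, kind, payload, style):
        base = index * FLOW_SIZE
        self._flows[base:base + FLOW_SIZE] = [kind, payload, style]

    def remove(self, index):
        base = index * FLOW_SIZE
        del self._flows[base:base + FLOW_SIZE]

    def get_flow_count(self):
        return len(self._flows) // FLOW_SIZE

    def get_flow_kind(self, index):
        return self._flows[index * FLOW_SIZE]

    def get_flow_payload(self, index):
        return self._flows[index * FLOW_SIZE + 1]

    def get_flow_style(self, index):
        return self._flows[index * FLOW_SIZE + 2]

block = Block(style="paragraph")
block.add(FLOW_TEXT, "Hello ", "normal")
block.add(FLOW_OBJECT, "<button>", "widget")
block.add(FLOW_TEXT, "world", "bold")
block.remove(1)                              # drop the inline object
block.set(1, FLOW_TEXT, "there", "italic")   # replace the second text flow
```

Storing all flows in one flat array avoids allocating a separate object per flow, which is presumably the point of packing them.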
Layer
To create a Layer you need to pass an array of Blocks along with an instance of a ModelToView class. This class converts the Block representation to the view representation using the LineBuilder class, which is responsible for laying out the individual rows of text and inline objects to a given width. Whenever the Layer is resized, this process is repeated.
You can look at an example of how the ModelToView can be implemented here.
LineBuilder
It supports these operations:
- adding and clearing of floats on the left or right side
- adding text with word wrapping and inline objects
- adding multiple draw layers for fancy backgrounds and foregrounds
- aligning inline parts to center/right or spreading it to the edges (justify)
Each inline part has its own handler that is responsible for sizing, painting and handling events from the user.
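To give a feel for how floats interact with word wrapping, here is a small Python sketch. It is not the LineBuilder API: wrap_words and its parameters are hypothetical, widths are counted in character cells instead of measured with a font, and a float is modeled simply as something that narrows the first few rows.

```python
def wrap_words(text, width, float_width=0, float_rows=0):
    """Greedy word wrap; the first float_rows rows are narrowed by float_width.

    Widths are in character cells (an assumption; real code measures fonts).
    """
    rows, current = [], ""
    for word in text.split():
        # Rows that sit next to the float have less room available.
        avail = width - (float_width if len(rows) < float_rows else 0)
        candidate = word if not current else current + " " + word
        if len(candidate) <= avail or not current:
            current = candidate   # fits, or a single word overflows the row
        else:
            rows.append(current)
            current = word
    if current:
        rows.append(current)
    return rows

with_float = wrap_words("aa bb cc dd ee", 8, float_width=3, float_rows=1)
without_float = wrap_words("aa bb cc dd ee", 8)
```

Here the first row next to the float holds only "aa bb", while without the float it holds "aa bb cc".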
Demo
The demo shows a test of the rich text component. It shows these features:
- floats and the layouting of them in a non-trivial configuration
- word wrapping the text including treating text that has different styles but no spaces between as one "word"
- inserting buttons and text fields as inline objects, allowing for interactive elements
- testing the overdraw feature that allows inline objects to be bigger than the row (eg. to support shadows)
- support for tables by using nested layers and having the table as an inline object
- support for overflowing when an inline object is wider than the layer (a horizontal scrollbar appears)
- ability to highlight the internal parts that the text and inline objects were converted to
The model of the shown content is defined in code here.
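The "one word across different styles" behavior from the list above can be sketched like this. The function name and the (text, style) run format are assumptions for the example, not the demo's actual model: the idea is only that a word boundary is a space, never a style change.

```python
def split_into_words(runs):
    """runs: list of (text, style) pairs.

    Returns a list of words, each word being a list of (fragment, style)
    pieces that must stay together on the same row.
    """
    words, current = [], []
    for text, style in runs:
        for i, part in enumerate(text.split(" ")):
            if i > 0 and current:     # a space ends the current word
                words.append(current)
                current = []
            if part:
                current.append((part, style))
    if current:
        words.append(current)
    return words

runs = [("Hello wor", "normal"), ("ld", "bold"), ("! bye", "normal")]
words = split_into_words(runs)
```

The middle word consists of three pieces with different styles ("wor", "ld", "!"), yet a line breaker working on this output would never split it.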
How it is used for rendering of HTML pages
The features of the rich text component are all that is needed to support rendering full HTML pages. Some examples of how things are handled:
- block elements (such as divs) are converted to a single Block containing an inline object with full width, with the content in a nested Layer
- inline block elements (display: inline-block) are handled like block elements, except that they are not put in a dedicated Block but sit alongside other inline elements
- the CSS layers and the stacking context are resolved, flattened to a linear list of layers and mapped to the rich text layers
- the hierarchy of inline elements (such as spans and other formatting elements) is flattened to a linear representation
- styling of inline elements (such as nested borders and backgrounds) is converted to a linear representation as well; the hierarchical nature is then emulated in an inline part handler (this gives the right visual result, but the rich text component just sees a linear list of inline parts)
- tables are handled similarly to block elements; each cell has its own Layer to show the content of the cell
- relative positioning (position: relative) uses a "mirror" mechanism: the original Layer is laid out as normal, but the drawing is redirected to another layer that is shown at a different position
- sometimes the width/height of the resulting layout needs to be calculated (such as to calculate the widths of table cells); this is handled by a special mode that lays out the Layer at a very large width and reads back the width actually used (some operations are omitted in this mode, such as text alignment or anything that would try to "fill" the whole width)
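Flattening the hierarchy of inline elements can be pictured with a short recursive function. This is a Python sketch under assumed data shapes (a node is either a string or a (tag, children) tuple, and flatten_inline is a made-up name); it is not how FixBrowser represents HTML.

```python
def flatten_inline(node, inherited=()):
    """node is either a str (text) or a (tag, children) tuple.

    Returns a flat list of (text, chain_of_enclosing_tags) pairs.
    """
    if isinstance(node, str):
        return [(node, inherited)]
    tag, children = node
    chain = inherited + (tag,)
    flat = []
    for child in children:
        flat.extend(flatten_inline(child, chain))
    return flat

tree = ("span", ["a ", ("b", ["bold ", ("i", ["both"])]), " z"])
flat = flatten_inline(tree)
```

Each text piece carries the chain of elements it was nested in, so a handler can still draw nested borders or backgrounds even though the layout code only sees a linear list.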
The result is that the rich text component can be fully developed, and all edge cases resolved, without any concern for the complexities of rendering HTML pages. This would be impossible with JavaScript support, as that would require handling everything in one giant interconnected mess instead of having a nice separation of concerns.
Unicode support
My general stance about Unicode support is to simplify the processing of Unicode in a way that makes it simple to use by other code. If that can't be done there is a question if that particular feature should be supported. I'm willing to not support every language out there if otherwise it would mean that the implementation complexity would spike as that has a cascade effect on everything.
Characters in Unicode are hard. They can be a single code point or a combination of multiple code points. Many characters can be represented in multiple ways (e.g. precomposed vs. decomposed characters) but they should be treated as equal. Each "true" character, as perceived from a user's perspective, is called a grapheme cluster.
They should be treated as a single character, that means selection and editing actions should treat it as an opaque block. However when writing characters it needs to be temporarily handled specially as the user needs to add code points to compose the final character by using a specific input method.
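The precomposed vs. decomposed problem is easy to demonstrate with Python's standard unicodedata module (used here only for illustration; FixBrowser itself is written in FixScript). The same visible character "é" compares as different until both strings are normalized to the same form.

```python
import unicodedata

precomposed = "\u00e9"     # é as a single code point
decomposed = "e\u0301"     # e followed by COMBINING ACUTE ACCENT

# A raw comparison sees two different sequences of code points.
same_raw = precomposed == decomposed

# After normalizing both to NFC they compare as equal.
same_nfc = (unicodedata.normalize("NFC", precomposed)
            == unicodedata.normalize("NFC", decomposed))

lengths = (len(precomposed), len(decomposed))
```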
Grapheme clusters
As explained before the overall approach to the design is to follow the UNIX philosophy of "doing one thing and doing it well". Mixing multiple aspects together, such as handling complexity of Unicode with rich text handling, is against this philosophy.
Instead I came up with a simple solution (finding such simple solutions can be surprisingly hard though) that decouples these problems. FixBrowser is written in an extensible language (FixScript) that allows a wide range of extensions, both directly in the language and by adjusting the native runtime. This made it possible to implement dynamic allocation of characters that represent whole grapheme clusters. It even integrates with the garbage collector (GC) to deallocate characters that are no longer used.
This way the rich text implementation can be blissfully ignorant about the whole concept and work with grapheme clusters as single characters like in the good old pre-Unicode days when you dealt with just one language/encoding.
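The general idea of allocating a single character per grapheme cluster can be sketched in Python as below. This is only an illustration under loud assumptions: ClusterTable, simple_clusters and encode are hypothetical names, the choice of a private use area as the allocation range is a guess, the segmentation is a crude "base character plus combining marks" rule instead of the full Unicode rules, and the GC integration described above is omitted.

```python
import unicodedata

class ClusterTable:
    """Maps multi-code-point clusters to single private-use code points."""
    FIRST = 0xF0000  # Supplementary Private Use Area-A (an assumed choice)

    def __init__(self):
        self._by_cluster = {}
        self._by_code = {}

    def intern(self, cluster):
        if len(cluster) == 1:
            return cluster                 # already a single code point
        code = self._by_cluster.get(cluster)
        if code is None:                   # allocate a new character
            code = chr(self.FIRST + len(self._by_cluster))
            self._by_cluster[cluster] = code
            self._by_code[code] = cluster
        return code

    def expand(self, text):
        return "".join(self._by_code.get(ch, ch) for ch in text)

def simple_clusters(text):
    """Rough segmentation: a base character plus following combining marks."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def encode(text, table):
    return "".join(table.intern(c) for c in simple_clusters(text))

table = ClusterTable()
original = "cafe\u0301"          # "café" with a decomposed é (5 code points)
encoded = encode(original, table)
```

After encoding, the string has one character per perceived character, so code that indexes, selects or deletes by character position does not need to know about clusters at all; expand restores the original code points when needed.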
Right-to-left languages
I have not done any implementation work on this yet but from the preliminary research it doesn't appear to be that complicated to add. It affects the direction of layouting of inline objects and how the selection is done. Something that can be directly handled by the rich text component without adding too much complexity (famous last words).
Fundraising
The project needs your help. This initial round aims to raise €5000. It will allow me to work on FixBrowser to make it usable for actual browsing and to implement some of the additional areas chosen by a vote. The work will take about a year, with major improvements expected within about 6 months.