TeX/LaTeX markup

Thursday, September 26, 2013

The End

The Google Summer of Code 2013 is over, and this is a great opportunity to summarize what is done and working, and what still waits to be implemented in the LaTeX interpreter.

New interpreters skeleton

As part of my project I made a new abstract class named base_text_render. This class provides the core functionality of every interpreter when using the 'fltk' graphics toolkit. A text_render class works as the top-level wrapper and points to one of three interpreters. The user can directly pick the ft_render or the latex_render class. If a library needed by the chosen renderer is missing, the user is pointed to the dummy_render class, which basically can't do anything. It's there just to keep the program from breaking. Here is a picture visualizing the class inheritance.

*- Class inheritance picture -*

The user can also use the 'tex' interpreter; this choice points to the ft_render class. But instead of plain FreeType characters it uses the characters defined within the TeX symbols.

Implemented functionality

At the end of this Summer of Code the core functionality of the LaTeX interpreter is in place. It can be used only for on-screen rendering. Users can use the following text properties:

background color - can't be used directly as a text property, but users can get it with the LaTeX command \colorbox{declared-color}{Text}
color - can be used within text properties as the color parameter, and directly in LaTeX as the command \textcolor{declared-color}{Text}

Users can use all color commands that LaTeX accepts within the color package.

fontangle - can't be used as a text property, only directly within commands: \textit{Text} for 'italic' and \textsl{Text} for 'oblique' text
fontname - can be used directly in Octave, but when using the LaTeX interpreter the only acceptable fonts are the LaTeX fonts
fontsize - can be used directly
fontweight - can be used only through LaTeX commands: \textbf{Text} for the 'bold' font and \textmd{Text} for 'demi'
rotation - users can rotate text by 0, 90, 180 and 270 degrees
horizontalalignment - 'left', 'right' and 'center' are accepted, with 'center' as the default
verticalalignment - users can use 'middle', 'top', 'baseline', 'cap' and 'bottom', with 'bottom' as the default

NOTE: The default values given here are for the title; for other text objects they can be different.

All of this was tested on Ubuntu 13.04, and it works nicely and quite fast. It may be a bit slower than ft_render; the measured difference is 10 - 50 milliseconds, and it tends toward the upper number when using Octave with the GUI. I'm quite confident that it will work the same on all Linux-based platforms, with some difference in speed depending on the hardware.

Things that could/should be done

It should be tested on Windows (it already has been and the test fails, so when my private obligations allow, I will try to fix it). After that, naturally, on some Mac system.

The already working on-screen rendering could be upgraded with functionality that already exists in the ft_render class, for example the lineheight property and better multi-line rendering.

As part of the project, after the start of coding my mentor suggested implementing this interpreter on the printing side too. But there was no time left to do this, so for now it's on hold. Maybe there are other things that can be done, but as the author of this project I don't see them now.

P.S.: The files I touched that are relevant for this interpreter are txt-render.h/.cc, latex-render.h/.cc, txt-eng-ft.h/.cc and graphics.cc. My online repository is here: http://inversethought.com/hg/octave-lojdl

Monday, August 26, 2013

On screen LaTeX interpreter

On-screen rendering using the LaTeX interpreter changed a few times. There were problems regarding the output image format. First I used the PPM format, but it doesn't support transparency, so it was changed to the BMP format. After some time spent implementing a reader for this format and testing, the result was that Ghostscript doesn't work nicely with it. So it was changed again, to the PNG format.

With the new format there is nice transparent rendering of the LaTeX file (constructed in Octave), and Ghostscript uses subsample anti-aliasing to produce a nice image. The data is then read from the PNG image and transferred to the OpenGL buffer.

The upgraded interpreter has some new functionality. It can change the color of text in Octave directly and then apply this color to all text that isn't already colored. This exception is added because the user can color text directly in the input string; this way the user can have multicolored text, as in the example below. The user can also change the size of the font directly in Octave using the 'fontsize' parameter.

graphics_toolkit('fltk');
plot(1,1);
title('Text \color{red}{colored}','interpreter','latex','color','blue');

To change the size of text the user can use the 'fontsize' property. Through this property the user changes the resolution of the output from Ghostscript, and this is scaled to represent a direct change of font size in points.

title('Text','interpreter','latex','color','blue','fontsize',48);
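
The relation between 'fontsize' and the Ghostscript resolution can be sketched roughly like this. It is only an illustration of the idea, not the code from latex-render.cc: the name resolution_for_fontsize, the 10 pt base size of the LaTeX document and the 72 dpi screen resolution are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the Octave sources): choose the value
// passed to Ghostscript's -r option so that text typeset at base_pt
// points looks like fontsize_pt points on a screen_dpi display.
inline int resolution_for_fontsize(double fontsize_pt,
                                   double base_pt = 10.0,     // assumed document size
                                   double screen_dpi = 72.0)  // assumed screen resolution
{
  assert(fontsize_pt > 0 && base_pt > 0 && screen_dpi > 0);
  // Rendering at k times the resolution makes every glyph k times
  // larger in pixels, which is the same as a k times larger font.
  return static_cast<int>(std::lround(screen_dpi * fontsize_pt / base_pt));
}
```

With these assumed defaults, a 'fontsize' of 20 would ask Ghostscript for 144 dpi, twice the base resolution.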

And here is an example using special mathematical characters:

title('$\oint \! \nabla f \, \dif t = 0$','interpreter','latex','fontsize',48);

As you can see, it needs to be enclosed in dollar signs. Adding bold, italic and other types of text is an option, but as Matlab doesn't support this, we won't support it either.

As you can see there are some things that could be upgraded. The first and most obvious is alignment, but this isn't the priority. The next step is to check for libraries and dependencies, and to configure this new interpreter to work on all major platforms. To check how development is evolving, feel free to download from my online repository.

Tuesday, August 6, 2013

New timeline - part 2

The evaluation period is over, and the second part of Google Summer of Code has started. So here is the second part of the timeline for my project.

Week 1 - [ 05.08 ÷ 09.08] - Upgrade on-screen rendering

The proof of concept has some flaws that need to be fixed, and the LaTeX interpreter generally needs upgrading. The first thing is to change the image format from PPM to a format that lets Ghostscript store information about transparency. The focus is on the BMP format: it's one of the output devices in GS and it isn't compressed, so it's relatively easy to extract raw data from this file format. The next thing is to shrink the image and improve image quality. This part will be done with ImageMagick and GS. There are several additional nice upgrades that are not necessary, like piping dvips and GS, or changing the font and font size. These will be worked on depending on time.

Week 2 - [ 12.08 ÷ 16.08 ] - Figure out how printing works

By this week there will be a functional and fairly good LaTeX interpreter for on-screen rendering. The plan is to implement this interpreter on the printing side. The first step is discovering exactly how the printing side works and planning how to implement the LaTeX interpreter there.

Week 3/4 - [ 19.08 ÷ 30.08 ] - Implement printing side

These two weeks should be used to implement and fully test the LaTeX interpreter on the printing side. For printing, the EPS (vector) image format will be used. The printing side is responsible for saving plots to files. This will enable the user to save a plot with fancy LaTeX markup and present the results to other people.

Week 5 - [ 02.09 ÷ 06.09 ] - Configuring interpreter functionality

It would be nice to check whether the user has all the binaries required to run the LaTeX interpreter. If not, the interpreter should be disabled and a proper message printed to the terminal, probably explaining how to solve this and what exactly needs to be on the system for it to work properly.
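
One possible shape for such a check is sketched below. This is an assumption about how it could be done, not existing Octave code: find_program is a hypothetical helper, it takes the search path as an argument (on Linux this would be the value of the PATH environment variable, separated by ':'), and it only tests that a file can be opened, not that it is executable.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: return the first "<dir>/<program>" on a
// colon-separated search path that can be opened for reading, or an
// empty string when the program is not found anywhere on the path.
inline std::string find_program(const std::string& program,
                                const std::string& search_path)
{
  std::istringstream dirs(search_path);
  std::string dir;
  while (std::getline(dirs, dir, ':'))
    {
      if (dir.empty())
        continue;  // skip empty path entries
      const std::string candidate = dir + "/" + program;
      if (std::ifstream(candidate).good())
        return candidate;
    }
  return std::string();
}
```

The interpreter could then be disabled when find_program returns an empty string for latex, dvips or gs, and the message to the user could name the missing program.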

Week 6 - [ 09.09 ÷ 13.09 ] - Expand to all platforms

All development is done on Ubuntu 12.10. The last week of the coding period is the right time to check how this new functionality works on other systems. After testing, the code should be modified, because the programs running in the background that enable the interpreter have different names and behave differently on different systems.

Tuesday, July 30, 2013

Mid - term

And here it is, the mid-term evaluation period is already open. It seems like yesterday that we began working on our projects. Since the last post, which explained how the text_render class looks, I have been implementing it in the existing code.

Here is the final result, a proof-of-concept LaTeX interpreter. As you can see, it needs to be upgraded.

Here's the part of the Octave code producing this plot.

graphics_toolkit('fltk');
x=0:0.1:10;
plot(x,sqrt(x));
title('Plot with LaTeX','fontsize',28);
ylabel('y=\sqrt(x)','interpreter','latex');

For now I will list only the main points that should be in the next upgrade:
- it should have a transparent background
- the quality of the image should be higher
- the image should be smaller

Transparent background
The problem with the current revision is that it uses the PPM format for transferring data to the pixels member of any render class. It's handy, because Octave can open this file directly, read the image width and height, and then just read the data stored in every pixel (red, green and blue components). But it has one big defect: the alpha channel is missing. So there has to be another format, or some kind of trick to get around this shortage of information.

Antialiasing
The current image is rendered from the EPS file at 600 dpi. The idea is to render a larger image and then scale it down to fit the original image size on the Octave plot. With this trick, the rendered LaTeX image will be smoother, and smoother edges mean better image quality. (Note: the image in the plot presented here is rendered at 300 dpi to produce a smaller image.)
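
The "render large, then shrink" trick can be sketched with a simple box filter. This is one possible way to do the scaling, shown for a single grayscale channel, and not necessarily what the interpreter will end up using; downsample_box is a made-up name and the function assumes that width and height are multiples of the factor.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Shrink a grayscale image by an integer factor, averaging each
// factor x factor block of source pixels into one output pixel.
// Assumes width and height are multiples of factor.
inline std::vector<std::uint8_t>
downsample_box(const std::vector<std::uint8_t>& src,
               std::size_t width, std::size_t height, std::size_t factor)
{
  const std::size_t out_w = width / factor;
  const std::size_t out_h = height / factor;
  const std::size_t block = factor * factor;
  std::vector<std::uint8_t> dst(out_w * out_h);

  for (std::size_t oy = 0; oy < out_h; ++oy)
    for (std::size_t ox = 0; ox < out_w; ++ox)
      {
        std::size_t sum = 0;
        for (std::size_t dy = 0; dy < factor; ++dy)
          for (std::size_t dx = 0; dx < factor; ++dx)
            sum += src[(oy * factor + dy) * width + ox * factor + dx];
        // Add half a block before dividing so the average is rounded
        // to the nearest value instead of truncated.
        dst[oy * out_w + ox] =
          static_cast<std::uint8_t>((sum + block / 2) / block);
      }
  return dst;
}
```

Pixels on the edge of a glyph end up with intermediate gray values, which is what makes the shrunken text look smooth.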

Image size
After placing the image on the plot it is obvious that it's too big. It should be smaller to fit with the other text on the plot. As mentioned before, we want a smaller image than the rendered one, but with better smoothing, so the large rendering will be scaled down to get this result.

Thursday, July 18, 2013

Implementation

Octave uses the FreeType library to render text for OpenGL. Through the ft_render class, a string is rendered depending on font size, color and other parameters.
Every text object has an interpreter parameter, which is set to 'tex' by default, although Octave implements only a limited set of TeX functionality. All this goes through the ft_render class, and as a result we have the text as a uint8NDArray, which is then used by OpenGL for rendering the rasterized image.

When upgrading an existing system, the programmer has to preserve the existing functionality while adding the new one. To add LaTeX markup, there has to be a class for it, and naturally some dispatch mechanism on top that picks which type of renderer will be used. The solution to this problem came with an abstract class and a dispatch mechanism inside a wrapper. So here is the new class organisation.

text_render is the top wrapper class, and inside it there is a dispatch mechanism pointing to one of three classes for text rendering. It also handles memory and uses the base_text_render class to create and destroy objects, through the rep pointer.

base_text_render is an abstract class used as the parent class of the text rendering classes. It has the base virtual methods, named base methods on the image presented above this text.

ft_render is the existing class, using FreeType to render text. It inherits the base methods and has additional methods, which compute the bounding box, rotate the image, change the rendering mode and get the extent.

dummy_render is an empty class. It is integrated into this concept because we can use LaTeX only if the additional programs it needs are installed, and the renderer used by default is useless without the FreeType library installed. So to get around a null rep pointer, this class will be picked, and it will just have a method that prints an error and helps users install FreeType and/or LaTeX.
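
The dispatch idea can be sketched in a few lines. This is a simplified illustration and not the real Octave classes: the class names follow the post, but the name and can_render methods, the have_freetype and have_latex flags, and the use of std::unique_ptr instead of a raw rep pointer are assumptions made to keep the example short.

```cpp
#include <memory>
#include <string>

// Abstract parent of all renderers (reduced to two illustrative methods).
class base_text_render
{
public:
  virtual ~base_text_render() = default;
  virtual std::string name() const = 0;
  virtual bool can_render() const = 0;
};

class ft_render : public base_text_render
{
public:
  std::string name() const override { return "ft_render"; }
  bool can_render() const override { return true; }
};

class latex_render : public base_text_render
{
public:
  std::string name() const override { return "latex_render"; }
  bool can_render() const override { return true; }
};

// Fallback so that the wrapper never holds a null rep pointer.
class dummy_render : public base_text_render
{
public:
  std::string name() const override { return "dummy_render"; }
  bool can_render() const override { return false; }
};

// Top wrapper: picks one renderer and forwards every call to it.
class text_render
{
public:
  text_render(const std::string& interpreter,
              bool have_freetype, bool have_latex)
    : rep(make_rep(interpreter, have_freetype, have_latex)) { }

  std::string name() const { return rep->name(); }
  bool can_render() const { return rep->can_render(); }

private:
  static std::unique_ptr<base_text_render>
  make_rep(const std::string& interpreter,
           bool have_freetype, bool have_latex)
  {
    if (interpreter == "latex" && have_latex)
      return std::unique_ptr<base_text_render>(new latex_render);
    if (interpreter != "latex" && have_freetype)
      return std::unique_ptr<base_text_render>(new ft_render);
    // The requested renderer is unavailable: use the do-nothing one.
    return std::unique_ptr<base_text_render>(new dummy_render);
  }

  std::unique_ptr<base_text_render> rep;
};
```

Code that uses text_render never needs to know which concrete class sits behind rep, so adding another interpreter later only touches make_rep.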

latex_render contains the functions already developed in C. They are modified to use C++ libraries and to work as methods inside this class.
- adapter method - gets the input string and then creates the TeX file for further use.
- render method - uses the LaTeX system and Ghostscript as described in the earlier post. The final result is a bitmap image. But the OpenGL renderer needs a uint8NDArray, so this part is not finished.
- get_bbox method - opens the EPS file and reads the bounding box data, which is then transformed to true values
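
As an illustration of the get_bbox step, here is a small sketch that finds the %%BoundingBox comment in an EPS file and converts its size to pixels. It is not the method from latex-render.cc: eps_bbox, read_eps_bbox and points_to_pixels are hypothetical names, and the conversion relies only on the fact that a PostScript point is 1/72 of an inch.

```cpp
#include <cmath>
#include <istream>
#include <sstream>
#include <string>

struct eps_bbox
{
  int llx, lly, urx, ury;  // lower-left and upper-right corner, in points
  bool found;
};

// Scan an EPS stream for the first "%%BoundingBox:" line that carries
// four integers. A "(atend)" entry is skipped, because the real values
// then follow in the trailer of the file.
inline eps_bbox read_eps_bbox(std::istream& in)
{
  const std::string key = "%%BoundingBox:";
  std::string line;
  while (std::getline(in, line))
    {
      if (line.compare(0, key.size(), key) != 0)
        continue;
      std::istringstream fields(line.substr(key.size()));
      eps_bbox box{};
      if (fields >> box.llx >> box.lly >> box.urx >> box.ury)
        {
          box.found = true;
          return box;
        }
    }
  return eps_bbox{};  // all zeros, found == false
}

// Convert a length in PostScript points to pixels at the given resolution.
inline int points_to_pixels(int points, double dpi)
{
  return static_cast<int>(std::ceil(points * dpi / 72.0));
}
```

For a box that is 62 points wide, rendering at 144 dpi gives points_to_pixels(62, 144.0), an image 124 pixels wide.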

NOTE: The Octave community uses an interesting concept for memory management. There is an abstract class and then a wrapper on top. When someone creates an object, it's not handled directly in further code; rep pointers are used instead. Here is one example from liboctave/util/

...

class octave_mutex;

class
octave_base_mutex
{
public:

  friend class octave_mutex;

  octave_base_mutex (void) : count (1) { }

  virtual ~octave_base_mutex (void) { }

  virtual void lock (void);

  virtual void unlock (void);

  virtual bool try_lock (void);

private:

  octave_refcount<int> count;
};

class
OCTAVE_API
octave_mutex
{
public:

  octave_mutex (void);

  octave_mutex (const octave_mutex& m)
    : rep (m.rep)
  {
    rep->count++;
  }

  ~octave_mutex (void)
  {
    if (--rep->count == 0)
      delete rep;
  }

  octave_mutex& operator = (const octave_mutex& m)
  {
    if (rep != m.rep)
      {
        if (--rep->count == 0)
          delete rep;

        rep = m.rep;
        rep->count++;
      }

    return *this;
  }

  void lock (void)
  {
    rep->lock ();
  }

  void unlock (void)
  {
    rep->unlock ();
  }

  bool try_lock (void)
  {
    return rep->try_lock ();
  }

protected:

  octave_base_mutex *rep;
};

...

Wednesday, July 3, 2013

From string to image

As you might have noticed, a good part of my project consists of converting a string with LaTeX markup into a bitmap image. As this can't be done directly from the string, we use some additional programs. Here is a little sketch, focusing on how the parts work together.

----------------------------------------------------------------------------------------------------------------------------------

string --> Adapter --> .tex --> LaTeX system --> .dvi --> dvips --> .eps --> Ghostscript --> .png +--> bounding box

----------------------------------------------------------------------------------------------------------------------------------

The Adapter transfers the string with integrated markup into a LaTeX file, with everything set for further use. The file is predefined within the code; the Adapter just adds the string in the right place. Further work and additional testing will probably change how exactly this code works, but for now it's working great.
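
In spirit, the Adapter does something like the sketch below. The actual predefined file is not shown in this post, so the preamble here (article class, the color package, an empty page style) is an assumption, and make_tex_document is a made-up name.

```cpp
#include <string>

// Hypothetical adapter: put the user's string into a fixed LaTeX
// document. The preamble below is an assumed example, not the
// template used by the real interpreter.
inline std::string make_tex_document(const std::string& text)
{
  const std::string header =
    "\\documentclass{article}\n"
    "\\usepackage{color}\n"
    "\\pagestyle{empty}\n"   // no page number in the rendered image
    "\\begin{document}\n";
  const std::string footer =
    "\n\\end{document}\n";
  return header + text + footer;
}
```

The returned string would be written to a .tex file, which is then the input of the LaTeX system in the next step.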

The LaTeX system isn't part of Octave, so it needs to be installed separately on the platform. It takes a file with structured markup code and converts it to the device-independent (DVI) file format.

The dvips program converts this file into Encapsulated PostScript. This is done to get a vector image and the bounding box (part of this file is a line with the coordinates of the bounding box).

Ghostscript is part of the Octave installation. It takes the previous file and converts it to a bitmap image. As users we can control image quality through the resolution.
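
The three external steps can be written down as command lines. The sketch below only builds the strings; the options shown are a plausible choice and not necessarily the ones my code uses. render_commands is a hypothetical helper, and it assumes that base is a plain file name without spaces or shell metacharacters.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build the command lines that turn base.tex
// into base.png at the given resolution.
inline std::vector<std::string>
render_commands(const std::string& base, int dpi)
{
  const std::string res = std::to_string(dpi);
  return {
    // .tex -> .dvi, without stopping to ask the user about errors
    "latex -interaction=nonstopmode " + base + ".tex",
    // .dvi -> .eps, -E asks dvips for a tight bounding box
    "dvips -E -o " + base + ".eps " + base + ".dvi",
    // .eps -> .png with alpha channel, cropped to the bounding box
    "gs -dNOPAUSE -dBATCH -dSAFER -dEPSCrop -sDEVICE=pngalpha -r" + res
      + " -sOutputFile=" + base + ".png " + base + ".eps"
  };
}
```

Each of these strings could then be passed to the C system function, one after another, stopping at the first command that fails.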

The bitmap image and bounding box will be used in the next few steps for adding the image to the OpenGL buffer. This micro system has been tested with all kinds of formula markup and for now everything is working okay. With every new step in development there will be additional testing. Code will be added for controlling image quality, depending on the size on the plot and the fontsize (LaTeX directly supports only 10/11/12 point font sizes).

New timeline

When someone sends an application to GSoC, part of the application is a proposed timeline. I wrote my timeline proposal thinking about how this project could be split into parts. Then I just arranged these parts according to their mutual dependencies. The original timeline can be found here.

After the intro task, I started working on the main priority, adding LaTeX markup to Octave. This markup will be new functionality. When the interpreter value of a text box on some plot is set to "LaTeX", the text should be rendered using the LaTeX system and then added to the plot through OpenGL rendering. There is a need to redefine the timeline. This need arose because the concept of the project is different from what I had imagined, and because I spent some time on the intro task, which was expected to be done during the bonding period.

TIMELINE << LaTeX markup project >>

Week 1 - (17.06 ÷ 23.06) - Preparation and intro task
This week was spent on preparation: reading the code and installing
all the programs necessary for development. Work on the intro task
also started.

Week 2 - (24.06 ÷ 30.06) - Intro task and additional dependencies
As it turned out, the intro task was adding the lineheight property to
text objects on plots. After finishing this task, a patch was made and
added to the GNU Octave patch manager. Using additional programs, parts
of the LaTeX system and Ghostscript, markup code was successfully
converted to an .eps file and then to a .png file, manually.

Week 3 - (01.07 ÷ 03.07) - Adapter for LaTeX system
The adapter for transferring LaTeX markup is working. The additional code
that calls dvips and Ghostscript is working too; it does this using the
system function in C. The adapter and the additional code are connected
and tested using this source for examples of formulas.

Week 4/5 - (08.07 ÷ 19.07) - Image quality and implementing code
Now that the micro system for transferring markup and converting
markup code to a bitmap image is ready, it should be implemented directly
in the Octave code. The questions are where and how exactly. Another thing
to look after is the quality of the bitmap images. Characters in the image
should be smooth, no matter what size they are on the final plot.

Week 6 - (22.07 ÷ 29.07) - Passing image to OpenGL and testing
The final image has to be placed in the OpenGL buffer for further rendering. This will be
done by modifying text::properties::update_text_extent and similar methods. After this part
is finished it should be tested and, if needed, the code upgraded for better performance.

Mid - term evaluations (30.07 ÷ 01.08)

By mid-term there should be, as my mentor calls it, a proof of concept. After that it should only be upgraded. The first thing is supporting LaTeX markup for printing formats. After that everything will be tested on Windows and OS X platforms, and every problem with other platforms solved. In the end the code should be configured: for example, if something is missing, we can't use LaTeX markup.