Diffstat (limited to 'docs/overview.txt')
-rw-r--r-- docs/overview.txt | 268
1 files changed, 268 insertions, 0 deletions
diff --git a/docs/overview.txt b/docs/overview.txt
new file mode 100644
index 00000000..2d4e25d7
--- /dev/null
+++ b/docs/overview.txt
@@ -0,0 +1,268 @@
+Contents
+========
+
+* Basic MuPDF usage example
+* Common function arguments
+* Error handling
+* Multi-threading
+
+Basic MuPDF usage example
+=========================
+
+For an example of how to use MuPDF in the most basic way, see
+doc/example.c. To limit the complexity and give an easier introduction
+this code has no error handling at all, but any serious piece of code
+using MuPDF should use the error handling strategies described below.
+
+Common function arguments
+=========================
+
+Many functions in MuPDF's interface take a context argument.
+
+A context contains global state used by MuPDF inside functions when
+parsing or rendering pages of the document. It contains, for example:
+
+* an exception stack (see "Error handling" below)
+
+* a memory allocator (allowing for custom allocators)
+
+* a resource store (for caching of images, fonts, etc.)
+
+* a set of locks and (un-)locking functions (for multi-threading)
+
+Other functions in MuPDF's interface take arguments such as document,
+stream and device which contain state for each type of object. Those
+arguments each have a reference to a context and therefore act as
+proxies for a context.
+
+Without the set of locks and accompanying functions the context and
+its proxies may only be used in a single-threaded application.
+
+Error handling
+==============
+
+MuPDF uses a set of exception handling macros to simplify error return
+and cleanup. Conceptually, they work a lot like C++'s try/catch
+system, but do not require any special compiler support.
+
+The basic formulation is as follows:
+
+    fz_try(ctx)
+    {
+        // Try to perform a task. Never 'return', 'goto' or
+        // 'longjmp' out of here. 'break' may be used to
+        // safely exit (just) the try block scope.
+    }
+    fz_always(ctx)
+    {
+        // Any code here is always executed, regardless of
+        // whether an exception was thrown within the try or
+        // not. Never 'return', 'goto' or longjmp out from
+        // here. 'break' may be used to safely exit (just) the
+        // always block scope.
+    }
+    fz_catch(ctx)
+    {
+        // This code is called (after any always block) only
+        // if something within the fz_try block (including any
+        // functions it called) threw an exception. The code
+        // here is expected to handle the exception (maybe
+        // record/report the error, cleanup any stray state
+        // etc) and can then either exit the block, or pass on
+        // the exception to a higher level (enclosing) fz_try
+        // block (using fz_throw, or fz_rethrow).
+    }
+
+The fz_always block is optional, and can safely be omitted.
+
+The macro-based nature of this system has three main limitations:
+
+1) Never return from within try (or 'goto' or longjmp out of it).
+ This upsets the internal housekeeping of the macros and will
+ cause problems later on. The code will detect such things
+ happening, but by then it is too late to give a helpful error
+ report as to where the original infraction occurred.
+
+2) The fz_try(ctx) { ... } fz_always(ctx) { ... } fz_catch(ctx) { ... }
+ construct is not one atomic C statement. That is to say, if you do:
+
+    if (condition)
+        fz_try(ctx) { ... }
+        fz_catch(ctx) { ... }
+
+ then the 'if' does not govern the whole try/catch construct, only
+ the first statement the macros expand to. Use braces instead:
+
+    if (condition) {
+        fz_try(ctx) { ... }
+        fz_catch(ctx) { ... }
+    }
+
+3) The macros are implemented using setjmp and longjmp, and so
+ the standard C restrictions on the use of those functions
+ apply to fz_try/fz_catch too. In particular, any "truly local"
+ variable that is set between the start of fz_try and something
+ in fz_try throwing an exception may become undefined as part
+ of the process of throwing that exception.
+
+ As a way of mitigating this problem, we provide an fz_var()
+ macro that tells the compiler to ensure that the variable's value
+ is preserved when an exception is thrown.
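
The hazard behind limitation 3 can be sketched with plain setjmp and
longjmp. This is not MuPDF's macro implementation: the demo_ names are
invented for illustration, and 'volatile' is used here as the standard C
way of keeping a local variable's value defined across a longjmp, which
is the guarantee fz_var() is described as providing.

```c
#include <setjmp.h>

static jmp_buf demo_env; /* hypothetical stand-in for the exception stack */

/* Plays the role of a function that "throws". */
static void demo_fail(void)
{
    longjmp(demo_env, 1);
}

/* Returns the last value assigned to 'steps' before the jump. */
int demo_count_steps(void)
{
    /* Without 'volatile', the value of 'steps' after the longjmp
     * would be indeterminate, because it is modified between the
     * setjmp call and the longjmp. */
    volatile int steps = 0;

    if (setjmp(demo_env) == 0)
    {
        steps = 1;
        steps = 2;
        demo_fail();
        steps = 3; /* never reached */
    }
    return steps;
}
```

Dropping the 'volatile' qualifier makes the returned value unreliable at
higher optimisation levels, which is the kind of failure fz_var() exists
to prevent.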
+
+A model piece of code using these macros then might be:
+
+    house build_house(fz_context *ctx, plans *p)
+    {
+        material m = NULL;
+        walls w = NULL;
+        roof r = NULL;
+        house h = NULL;
+        tiles t = make_tiles();
+
+        // m, w, r and h are all assigned inside the outer fz_try
+        // and read after it, so each one needs fz_var.
+        fz_var(m);
+        fz_var(w);
+        fz_var(r);
+        fz_var(h);
+
+        fz_try(ctx)
+        {
+            fz_try(ctx)
+            {
+                m = make_bricks();
+            }
+            fz_catch(ctx)
+            {
+                // No bricks available, make do with straw?
+                m = make_straw();
+            }
+            w = make_walls(m, p);
+            r = make_roof(m, t);
+            // Note, NOT: return combine(w,r);
+            h = combine(w, r);
+        }
+        fz_always(ctx)
+        {
+            drop_walls(w);
+            drop_roof(r);
+            drop_material(m);
+            drop_tiles(t);
+        }
+        fz_catch(ctx)
+        {
+            fz_throw(ctx, "build_house failed");
+        }
+        return h;
+    }
+
+Things to note about this:
+
+a) If make_tiles throws an exception, this will immediately be
+ handled by some higher level exception handler. If it
+ succeeds, t will be set before fz_try starts, so there is no
+ need to fz_var(t);
+
+b) We try first off to make some bricks as our building material.
+ If this fails, we fall back to straw. If making straw also fails,
+ we'll end up in the outer fz_catch, and the process will fail neatly.
+
+c) We assume in this code that combine takes a new reference to
+ both the walls and the roof it uses, and therefore that w and
+ r need to be cleaned up in all cases.
+
+d) We assume the standard C convention that it is safe to destroy
+ NULL things.
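
Assumptions (c) and (d) describe a reference-counting convention. A
minimal sketch of what such keep/drop functions might look like follows;
the demo_walls type and the demo_ functions are hypothetical and are not
part of MuPDF.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct
{
    int refs;
} demo_walls;

/* Creates an object owned by the caller (one reference). */
demo_walls *demo_new_walls(void)
{
    demo_walls *w = malloc(sizeof *w);
    if (w)
        w->refs = 1;
    return w;
}

/* Takes a new reference, as combine is assumed to do in note (c). */
demo_walls *demo_keep_walls(demo_walls *w)
{
    if (w)
        w->refs++;
    return w;
}

/* Releases one reference and frees the object when none remain.
 * Safe to call with NULL, as note (d) assumes. Returns the number
 * of references still outstanding. */
int demo_drop_walls(demo_walls *w)
{
    if (!w)
        return 0;
    if (--w->refs > 0)
        return w->refs;
    free(w);
    return 0;
}
```

With this convention the cleanup code in an always block can drop every
local unconditionally, whether or not it was ever assigned.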
+
+Multi-threading
+===============
+
+First off, study the basic usage example in doc/example.c and make
+sure you understand how it works as the data structures manipulated
+there will be referred to in this section too.
+
+MuPDF can usefully be built into a multi-threaded application without
+the library needing to know anything about threading at all. If the
+library opens a document in one thread, and then sits there as a
+'server' requesting pages and rendering them for other threads that
+need them, then the library is only ever called from this one thread.
+
+Other threads can still be used to handle UI requests etc, but as far
+as MuPDF is concerned it is only being used in a single-threaded way.
+In this instance, there are no threading issues with MuPDF at all,
+and it can safely be used without any locking, as described in the
+previous sections.
+
+This section will attempt to explain how to use MuPDF in the more
+complex case, where we genuinely want to call the MuPDF library
+concurrently from multiple threads within a single application.
+
+MuPDF can be invoked with a user-supplied set of locking functions.
+It uses these to take mutexes around operations that would conflict
+if performed concurrently in multiple threads. By leaving the
+exact implementation of locks to the caller, MuPDF remains agnostic
+of the threading library in use.
+
+The following simple rules should be followed to ensure that
+multi-threaded operations run smoothly:
+
+1) "No simultaneous calls to MuPDF in different threads are
+ allowed to use the same context."
+
+ Most of the time it is simplest to just use a different
+ context for every thread; just create a new context at the
+ same time as you create the thread.
+
+2) "The document is bound to the context with which it is created."
+
+ All subsequent accesses to the document implicitly use the same
+context; this means that only one thread can ever be accessing
+ the document at once. This does not mean that the document can
+ only ever be used from one thread, though in many cases this
+ is the simplest structure overall.
+
+3) "Any device is bound to the context with which it is created."
+
+ All subsequent uses of a device implicitly use the context with
+ which it was created; this means that if a device is used with
+ a document, it should be created with the same context as that
+ document was. This does not mean that the device can only ever
+ be used from one thread, though in many cases this is the
+ simplest structure overall.
+
+So, how does a multi-threaded example differ from a single-threaded
+one?
+
+Firstly, when we create the first context, we call fz_new_context
+as before, but the second argument should be a pointer to a set
+of locking functions.
+
+The calling code should provide FZ_LOCK_MAX mutexes, which will be
+locked/unlocked by MuPDF calling the lock/unlock function pointers
+in the supplied structure with the user pointer from the structure
+and the lock number, i (0 <= i < FZ_LOCK_MAX). These mutexes can
+safely be recursive or non-recursive, as MuPDF only calls in a
+non-recursive style.
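
As a rough illustration of the arrangement just described, the sketch
below backs a set of lock/unlock callbacks with POSIX mutexes. The
demo_locks struct and DEMO_LOCK_MAX are local stand-ins that mirror the
shape described above (a user pointer plus lock and unlock function
pointers); they are not MuPDF's own definitions, and pthreads is only
one possible choice of threading library.

```c
#include <pthread.h>

#define DEMO_LOCK_MAX 4 /* hypothetical stand-in for FZ_LOCK_MAX */

/* Stand-in for the structure handed to the library. */
typedef struct
{
    void *user;
    void (*lock)(void *user, int i);
    void (*unlock)(void *user, int i);
} demo_locks;

static pthread_mutex_t demo_mutexes[DEMO_LOCK_MAX];

/* Called with the user pointer and a lock number, 0 <= i < DEMO_LOCK_MAX. */
static void demo_lock(void *user, int i)
{
    pthread_mutex_t *m = user;
    pthread_mutex_lock(&m[i]);
}

static void demo_unlock(void *user, int i)
{
    pthread_mutex_t *m = user;
    pthread_mutex_unlock(&m[i]);
}

/* Initialises every mutex and fills in the callbacks.
 * Returns 0 on success, -1 if a mutex could not be created. */
int demo_locks_init(demo_locks *locks)
{
    int i;

    for (i = 0; i < DEMO_LOCK_MAX; i++)
        if (pthread_mutex_init(&demo_mutexes[i], NULL) != 0)
            return -1;
    locks->user = demo_mutexes;
    locks->lock = demo_lock;
    locks->unlock = demo_unlock;
    return 0;
}
```

Default (non-recursive) mutexes are enough here, since the text above
states that the library only locks in a non-recursive style.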
+
+To make subsequent contexts, the user should NOT call fz_new_context
+again (as this will fail to share important resources such as the
+store and glyphcache), but should rather call fz_clone_context.
+Each of these cloned contexts can be freed by fz_free_context as
+usual.
+
+To open a document, call fz_open_document as usual, passing a context
+and a filename; this context is bound to the document. All future
+calls to access the document will use this context internally.
+
+Only one thread at a time can therefore perform operations such as
+fetching a page, or rendering that page to a display list. Once a
+display list has been obtained, however, it can be rendered from any
+other thread (or even from several threads simultaneously, giving
+banded rendering).
+
+This means that an implementer has two basic choices when constructing
+an application to use MuPDF in multi-threaded mode. Either a single
+nominated thread opens the document and then acts as a 'server'
+creating display lists for other threads to render, or the
+application adds its own mutex around calls to MuPDF that use the
+document. The former is likely to be far more efficient in the long
+run.
+
+For an example of how to do multi-threading see doc/multi-threaded.c
+which has a main thread and one rendering thread per page.