Switch to reading content streams on the fly during interpretation.

Previously, before interpreting a page's content stream, we would load it entirely into a buffer and then interpret that buffer. This has a cost in memory use. Here, we update the code to read from a stream on the fly. This has required changes in various parts of the code.

Firstly, we have removed all use of the FILE lock. As stream reads can now safely be interrupted by resource (or object) reads from elsewhere in the file, the file lock becomes very hard to maintain, and doesn't actually benefit us at all. The choices were either to use a recursive lock or to remove it entirely; I opted for the latter. The file lock enum value remains as a placeholder for future use in extendable data streams.

Secondly, we add a new 'concat' filter that concatenates a series of streams together into one, optionally putting whitespace between each stream (as the PDF parser requires this).

Finally, we change page/xobject/pattern content streams to work on the fly, but we leave type3 glyphs using buffers (as presumably these will be run repeatedly).
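
To illustrate the idea behind the 'concat' filter described above, here is a minimal, self-contained sketch in C. It is not the code added by this commit and does not use MuPDF's stream API: the `segment` and `concat_reader` types and the `concat_init` and `concat_read` functions are hypothetical names invented for this example, and in-memory byte arrays stand in for the underlying streams. The sketch assumes the separator is a single space, emitted only between streams, so that (for instance) an operator ending one stream and an operator starting the next are not read as one token.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory segment standing in for one content stream. */
typedef struct
{
	const unsigned char *data;
	size_t len;
} segment;

/* Hypothetical reader state; none of these names come from MuPDF. */
typedef struct
{
	const segment *segs;
	size_t count;    /* number of segments */
	size_t current;  /* index of the segment being read */
	size_t offset;   /* read position within the current segment */
	int pad;         /* non-zero: emit one space between segments */
	int pad_pending; /* non-zero: a separator is due before the next read */
} concat_reader;

static void
concat_init(concat_reader *r, const segment *segs, size_t count, int pad)
{
	r->segs = segs;
	r->count = count;
	r->current = 0;
	r->offset = 0;
	r->pad = pad;
	r->pad_pending = 0;
}

/* Copy up to n bytes into out. Returns the number of bytes produced,
 * or 0 once every segment has been consumed. */
static size_t
concat_read(concat_reader *r, unsigned char *out, size_t n)
{
	size_t produced = 0;

	while (produced < n && r->current < r->count)
	{
		const segment *s = &r->segs[r->current];
		size_t avail, take;

		if (r->pad_pending)
		{
			/* Separator owed from the previous segment boundary. */
			out[produced++] = ' ';
			r->pad_pending = 0;
			continue;
		}

		avail = s->len - r->offset;
		if (avail == 0)
		{
			/* Segment exhausted: move on, and schedule a separator
			 * only if another segment follows. */
			r->current++;
			r->offset = 0;
			if (r->pad && r->current < r->count)
				r->pad_pending = 1;
			continue;
		}

		take = (n - produced < avail) ? n - produced : avail;
		memcpy(out + produced, s->data + r->offset, take);
		r->offset += take;
		produced += take;
	}
	return produced;
}
```

A caller would call `concat_init` once and then call `concat_read` repeatedly with a small buffer until it returns 0, so only one buffer's worth of data is held at a time rather than the whole concatenated stream.
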
author: Robin Watts <robin.watts@artifex.com> 2012-05-07 11:30:05 +0100
committer: Robin Watts <robin.watts@artifex.com> 2012-05-08 15:14:57 +0100
commit: 636652daee46a9cf9836746135e3f9678db796ec (patch)
tree: 110e78a0ffcb4a873088c92864ff182d783fdbc3 /pdf/pdf_repair.c
parent: 2433a4d16d114a0576e6a4ff9ca61ae4f29fdda0 (diff)
download: mupdf-636652daee46a9cf9836746135e3f9678db796ec.tar.xz
1 files changed, 2 insertions, 13 deletions
diff --git a/pdf/pdf_repair.c b/pdf/pdf_repair.c
index a51b9631..27846855 100644
--- a/pdf/pdf_repair.c
+++ b/pdf/pdf_repair.c
@@ -195,6 +195,7 @@ pdf_repair_obj_stm(pdf_document *xref, int num, int gen)
 	}
 }
 
+/* Entered with file locked, remains locked throughout. */
 void
 pdf_repair_xref(pdf_document *xref, pdf_lexbuf *buf)
 {
@@ -389,19 +390,7 @@ pdf_repair_xref(pdf_document *xref, pdf_lexbuf *buf)
 			/* corrected stream length */
 			if (list[i].stm_len >= 0)
 			{
-				fz_unlock(ctx, FZ_LOCK_FILE);
-				fz_try(ctx)
-				{
-					dict = pdf_load_object(xref, list[i].num, list[i].gen);
-				}
-				fz_always(ctx)
-				{
-					fz_lock(ctx, FZ_LOCK_FILE);
-				}
-				fz_catch(ctx)
-				{
-					fz_rethrow(ctx);
-				}
+				dict = pdf_load_object(xref, list[i].num, list[i].gen);
 				/* RJW: "cannot load stream object (%d %d R)", list[i].num, list[i].gen */
 
 				length = pdf_new_int(ctx, list[i].stm_len);