2012/06/27

Rant: Format This!

Submitting papers for publication is a painful process in many, many, many ways.  One of the most common modes of torture is having to reformat your manuscript from one set of guidelines to another.  Here I feature one shiny bit of ludicrousness that really makes me wonder where journal editors' priorities are.

Math Equations and DOCX

If your manuscript is or will be in DOCX and contains equations, you must follow the instructions below to make sure that your equations are editable when the file enters production.
If you have not yet composed your article, you can ensure that the equations in your DOCX file remain editable in DOC by enabling “Compatibility Mode” before you begin. To do this, open a new document and save as Word 97-2003 (*.doc). Several features of Word 2007/10 will now be inactive, including the built-in equation editing tool. You can insert equations in one of the two ways listed below.
If you have already composed your article as DOCX and used its built-in equation editing tool, your equations will become images when the file is saved down to DOC. To resolve this problem, re-key your equations in one of the two following ways.
  1. Use MathType to create the equation. MathType is the recommended method for creating equations.
  2. Go to Insert > Object > Microsoft Equation 3.0 and create the equation.
If, when saving your final document, you see a message saying “Equations will be converted to images,” your equations are no longer editable and PLoS will not be able to accept your file.

Seriously, folks.  It's 2012.  Let *.doc and Microsoft Equation 3.0 die already.  While you're at it, let's figure out a universal formatting guideline to submit with.  All of science will thank you.

2012/06/13

Making rApache load rJava

Here at work I've been in the business of developing webapps using R as the backend computational framework.  The list of parts to get this running is pretty lightweight, just:
I'm not going to cover how to set these things up here; there is pretty good documentation around the web and on rApache's site.  Instead, I'm going to talk about a hair-pulling setback I encountered early on.

Problem

R scripts run behind rApache cannot load rJava without throwing an HTTP 500 error.

Details

Specifically, if you look at the error_log file you see something like the following:
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/usr/local/lib64/R/library/rJava/libs/rJava.so':
  libjvm.so: cannot open shared object file: No such file or directory
Error: package 'rJava' could not be loaded
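
To see which dependency the dynamic linker fails to resolve, one option is to run ldd against the rJava.so path from the log and filter for unresolved entries. This is a minimal sketch, assuming a Linux system with ldd and awk available; missing_libs is a hypothetical helper name, not part of R or rJava.

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints the names
# of the libraries the dynamic linker reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (path taken from the error log above):
#   ldd /usr/local/lib64/R/library/rJava/libs/rJava.so | missing_libs
```

Run from a plain shell rather than from inside R, this should list libjvm.so if the JVM directory is unknown to the system linker, which is roughly the situation the Apache process is in.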

Running the same R script from
  • a user login session ... no problem.
  • behind PHP (via a system() call) ... no problem.

Suffice it to say, this had me really, really stumped.  Stumped enough to give up temporarily and settle for calling R code that needed rJava via a PHP-to-shell intermediary.  Of course, that got confusing and unscalable quite quickly, forcing me to find a real solution.

So I started digging and found one unanswered post on the rApache Google Group relating to this problem dating back to 2010 (it's answered now, with my solution as detailed below).  Not helpful.

More digging produced this post, which pointed me in the direction of the LD_LIBRARY_PATH variable, which apparently you shouldn't mess with directly unless you want a lot of R pain.


Using the following one line test script:
cat(Sys.getenv()['LD_LIBRARY_PATH'], '\n')

I quickly determined that rApache was NOT setting this variable, or anything else defined in
R/etc/ldpaths

before creating an instance of R.

According to the folks who work on RStudio, this variable needs to be set before R starts for rJava to initialize correctly - i.e. to be able to find libjvm.so.
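
On Linux, the environment a running process was started with can also be checked from outside R through /proc/<pid>/environ, which holds NUL-separated NAME=value entries. The sketch below is an illustration under those assumptions: env_value is a hypothetical helper name, and reading the environment of another user's process (such as an Apache worker) typically requires root.

```shell
# Hypothetical helper: reads a NUL-separated environment dump on stdin
# (the format of /proc/<pid>/environ) and prints the value of the
# variable named in $1. Prints nothing if the variable is absent.
env_value() {
    tr '\0' '\n' | awk -F= -v name="$1" '
        $1 == name { sub(/^[^=]*=/, ""); print }
    '
}

# Usage, where <pid> is the PID of an Apache worker process:
#   env_value LD_LIBRARY_PATH < /proc/<pid>/environ
```

An empty result for the Apache worker, next to a non-empty one for an interactive R session, would be consistent with what the one-line R test shows.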


So how do you do this in an Apache process?  I know that using a SetEnv directive in httpd.conf is a dead end.  Thankfully, folks at the Ubuntu forums found a way.

Solution

Here's my modification of the Ubuntu forum solution.


Step 1:
Add a file to:

/etc/ld.so.conf.d

called:

rApache_rJava.conf

with just a single line:

/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/

which happens to be the direct parent path to libjvm.so on my server.
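
The directory holding libjvm.so differs between distributions and Java versions, so it may be easier to search for it than to guess. This is a sketch, assuming JVMs are installed under /usr/lib/jvm as in the path above; libjvm_dirs is a hypothetical helper name.

```shell
# Hypothetical helper: prints every directory under the given root
# (default: /usr/lib/jvm) that contains a file named libjvm.so.
libjvm_dirs() {
    find "${1:-/usr/lib/jvm}" -name 'libjvm.so' -exec dirname {} \; 2>/dev/null | sort -u
}

# Usage:
#   libjvm_dirs
# If several directories are printed (e.g. client and server JVMs),
# pick one and put that single line in the .conf file.
```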

Step 2:
As root, run:
/sbin/ldconfig

Step 3:
Restart Apache
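
To confirm that Step 2 took effect, the linker cache can be inspected: ldconfig -p prints the cached libraries, one per line, with the library name as the first field. The sketch below filters that output; cache_has_lib is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ldconfig -p` output on stdin and exits 0
# if a library with exactly the name given in $1 is listed, 1 otherwise.
cache_has_lib() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage:
#   /sbin/ldconfig -p | cache_has_lib libjvm.so && echo "libjvm.so is in the cache"
```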

Wrap-up

After all this rigamarole it appears that I can load packages that depend on rJava from within rApache - i.e.  lines like
library(rJava)

no longer complain and I'm not getting any more HTTP 500 errors as a result, which makes me happy for the moment.  How long this happiness lasts remains to be seen.  R scripts within rApache still don't see an LD_LIBRARY_PATH variable, but at least the parent Apache process knows where to find libjvm.so.