The Joy of Supermongo
Supermongo (or SM) is a very powerful analysis and plotting package.
Plus, it has a pretty nifty icon. There is a printed manual to SM, but
it states, "SM is still evolving slowly, and this documentation may not
be true, helpful, or complete."
Hopefully this page will be a little more useful. Combined with the manual
and the help files,
it should make SM relatively painless. If you catch any mistakes on this page,
please email me and let me know.
Getting Started
- Type sm in an xterm window. Like most UNIX apps (but unlike IRAF),
you can run SM from any directory. The software package will be kind enough to
greet you, saying, Hello [username], please give me a command. If a
blank window with an SM bar at the top does not open, type device x11 to
open it. This ensures that plotting commands will be directed towards that
window, and you can see the results.
- If you know the command you want help on, type help [command]. The help files
are helpful to varying degrees.
- If you have a keyword you want to search on, type apropos [word].
- Use the ! character to run UNIX commands from within SM. This is
especially useful to print, check your directory, and do other mundane tasks.
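A first session might look something like the sketch below. The command I ask for help on, the search word, and the ls call are just placeholders I picked for illustration.

```
# open the plotting window if it did not appear by itself
device x11
# help on one command
help limits
# search the help files for a keyword
apropos histogram
# run a UNIX command without leaving SM
!ls
```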
Basics of Analysis and Plotting
- Specify the data file you want to use with data [filename].
- Read in the data by read {name1 column# name2 column#}.
- Lines commented with a # at the beginning will not be
read.
- If SM seems to be choking on really bizarre things, make sure the
first line of the input file starts at the first column, even if
the line is commented out (starts with #). That will
usually solve the problem.
- If you want to specify which lines to read from, use lines start#
end#. To read from a certain line to the end, use lines
start# 0.
- If you want to force the format of the columns you are reading in,
use .s and .f to force the column to be a
string or a float, respectively. These are appended to the
column number, so it would go like this: read {galname
1.s blah 2.f}. galname is now a vector of
strings and blah is now a vector of floating-points.
Note that this is the only way I've ever been able
to read strings into SM.
After the read, name1 and name2 are vectors of data points.
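Putting the reading steps together gives something like this. It is only a sketch: galaxies.dat is a made-up file that I am assuming has a two-line header, a name in column 1, and numbers in columns 2 and 3.

```
# hypothetical file: names in column 1, numbers in columns 2 and 3
data galaxies.dat
# skip the assumed two-line header: read from line 3 to the end of the file
lines 3 0
# galname is forced to be a string vector, vel a float vector
read { galname 1.s radius 2 vel 3.f }
```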
- Plotting the data:
- If you want the graph to scale with the data, use limits x
y, where x and y are the vectors that define the ranges of values
for the x- and y-axes. If you'd rather specify your own ranges, use
limits xmin xmax ymin ymax, where xmin, etc., are all
numbers. The two forms can also be combined (a vector for one axis and numbers for the other).
- To draw the box with the tickmarks, use box .
- The data can be displayed as points, or as lines connecting the points.
The commands are points x y and connect x y,
respectively; both plot y vs. x.
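As a minimal sketch, a basic plot then looks like the following. I am assuming x and y are vectors you have already read in, and the numbers in the commented-out limits line are made up.

```
# clear the window, scale the axes to the data, and draw the box
erase
limits x y
box
# plot y vs. x as points, then join them with a line
points x y
connect x y
# alternative to "limits x y": let x scale with the data but fix the y range
# limits x 0 100
```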
- Log scaling on axes:
- Maybe you don't want to plot y vs. x -- maybe you'd rather plot log10(y) vs. log10(x).
- Create the log vectors by lgx = lg(x).
- Use the ticksize command to tell SM that you will be
plotting in log space. ticksize -1 0 -1 0 will almost always have reasonable tickmarks.
- Then use the limits command to set the axis ranges, using log values. You can use
your log vectors, limits lgx lgy, or you can define them yourself. So if you want a
linear range of 0.01 to 10 on each axis, that's a log range of -2 to 1, and you would use limits -2 1 -2 1.
- Use box to draw the axes. Note that the axes will have log scaling, and the labels will be the linear values, not the logs (so the large tickmarks are marked 0.01, 0.1, 1, and 10, not -2, -1, 0, and 1).
- Plot the log vectors.
- Note that you will need to use ticksize 0 0 0 0 before you make your next plot in linear space, or else you'll get a crazy plot.
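Strung together, the log-plot steps come out roughly as below. This assumes x and y are existing vectors with only positive values, since you can't take the log of anything else.

```
# make the log vectors
set lgx = lg(x)
set lgy = lg(y)
# tell SM the axes are logarithmic
ticksize -1 0 -1 0
# set the ranges in log space, then draw the axes and the data
limits lgx lgy
erase
box
points lgx lgy
# go back to linear tickmarks before the next plot
ticksize 0 0 0 0
```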
- Quick analysis:
To specify some variable, use define. To specify or create a vector, use
set. The various options of these are covered in the SM help files. The
list define command lists all the variables stored in the history, with
their values. list set lists all the vectors, with their lengths.
- Define a variable by define name value. When referring
to the variable later, even if you are changing its value, refer
to it as $name.
- Create a vector by set name = expression. To create a vector
of a range of numbers, use set name = start, end,
increment. To create a vector of a list of values (or of a
list of vectors, which will be useful later on), use set
name = { list numbers or whatever with spaces in between }
- Note that indexing of vectors is like C++: a vector of 10 values has
indices 0, 1, . . . 8, 9. Referring to the second element in a vector
is done by name[1].
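Here is a small sketch of define and set in action. All of the names (rmax, r, v, second) are ones I made up for the example.

```
# a variable, referred to later as $rmax
define rmax 10
# a vector running from 0 to $rmax in steps of 0.5
set r = 0, $rmax, 0.5
# a vector from an explicit list of values
set v = { 1 2 3 5 8 }
# indexing starts at 0, so v[1] is the second element (2 here)
set second = v[1]
# see what is defined so far
list define
list set
```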
Formatting Plots
- Line types: The default line type is a solid line. Changing this is
done by ltype number, where number is between 0 and 6. (0 is the
solid line.)
- Point types: The default point type is an x. To get a real point, use
ptype 1 1. This has a decent help file.
- Colors: The default color is white. This can be changed by ctype color,
where color can be white, yellow, red, black, green, blue, or cyan. However,
if you are creating postscript files of your plots and also want them
displayed on your screen in white, use ctype default rather than ctype white so they
will print out okay.
- Labeling axes: xlabel Stuff and ylabel More Stuff are used
to label the x and y axes. toplabel Titled Stuff is used to place
a label above the graph. Text formatting
(Greek letters, subscripts, etc.)
can be done using LaTeX formatting. Make sure TeX_strings is set to 1 in your .sm file.
- Making a legend:
- Use relocate x y to position the cursor at x, y in
the plot, where x and y are in the coordinates of the graph's axes. To draw a line in the
style of the current ltype from there
to another point, use draw xnew ynew.
- Use dot to place one point in the style of the current
ptype at the location specified by relocate.
- Inserting text at the location is done with label Stuff, and
again can be formatted using LaTeX.
- To label the upper right-hand corner of the graph with the date and file name,
use identification.
- Sizes: the expand command is used to change the font size based on the
default. expand 1.25 increases the font size by 25%.
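To show how the formatting commands fit together, here is a sketch of a labeled plot with a one-line legend. The vectors r, v, and vmodel are hypothetical, and the relocate and draw positions are made-up numbers that you would need to change so they land inside your own axis limits.

```
# r, v, and vmodel are hypothetical vectors of the same length
erase
limits r v
box
# bigger labels, with a LaTeX-style subscript in the y label
expand 1.25
xlabel Radius
ylabel V_{rot}
toplabel Rotation curve
expand 1
# data as real points
ptype 1 1
points r v
# model as a red line in a non-solid line type
ctype red
ltype 2
connect r vmodel
# legend entry for the model, drawn while ltype 2 and red are still set
relocate 1 9
draw 2 9
relocate 2.2 9
label model
# put things back the way they were and stamp the plot
ltype 0
ctype default
identification
```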
Printing Plots and the Device Tool
The device tool is used to switch from the graph window and create postscript
files and gifs. Writing your commands in macros makes this infinitely
easier; I'll get to that further below.
- Upon starting SM, you should be in device x11, which is the screen.
- To create a postscript file of your plot, the command is generally
device postencap filename.ps. (Note: this may differ from version
to version of SM.) Until you switch back to device x11, all
your plotting commands will be directed towards the postscript file. Once
you have switched back to x11 (but NOT before!!) you can then view and
print the postscript file.
- Creating a gif is done the same way, using device gif,
device smallgif, and device blackgif. An important note
is that the gif file will preserve line colors, but not line types from
your plot.
- If you want to display more than one graph on a page, use the window
command. This is called by window nx ny x y, where nx and ny are the
number of columns and rows, respectively. x and y indicate the position of
the current plot, where 1 1 is the lower left-hand corner of the screen.
For a given page, the nx and ny numbers will be the same for each plot, but x and
y will change. Since SM saves all its parameters, you may have to use
window 1 1 1 1 to get back to a full-sized plot.
- If you want to create more than one plot at once, there are several
ways of doing this. The simplest is to have each plot on a separate page.
This is done using the page command. This is slightly tricky,
because if you are outputting to a device (for example, a postscript file),
it will start a new, clear page. But if you are plotting to the screen, it
will not erase the first graph before plotting the second one. So in general,
it is good to stick an erase command after (definitely not before!
d'oh) the page
command. Secondly, if you are running a macro and want to actually look at
the graph before it flashes to the second (or third . . . ) graph, stick a
define blob ? command in before the page. This will
make SM halt and
prompt you for a value for blob before it continues.
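As a rough sketch, here is a two-panel plot sent to a postscript file. The file name twopanel.ps and the vectors r and v are made up, and as noted above the postencap device name may differ in your version of SM.

```
# everything from here on goes to the file, not the screen
device postencap twopanel.ps
# one column, two rows: lower panel first
window 1 2 1 1
limits r v
box
points r v
# upper panel
window 1 2 1 2
limits r v
box
connect r v
# restore full-sized plots, then switch back to the screen
window 1 1 1 1
device x11
```

Only after the final device x11 is the postscript file ready to view or print.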
Macros, Making them from Outside SM, and Using Them
Macros are simply a list of all the commands you want SM to perform. The macro
file should have a '.sm' extension.
- Writing the macro: The first line should have the macro name, and
should NOT be indented. All subsequent lines should be indented. Use the
# character to comment out any lines.
- Reading in the macro: macro read name.sm
- Executing the macro: name (or whatever the first line of the
macro is. Note that it need not be the same as the filename of the macro.)
- Sending parameters to a macro: In the first line of the macro, indicate
the number of parameters you will pass to the macro. (Example: itsname
2 will pass two parameters to the macro when you execute it.) In the
macro, refer to the input variables as $1, $2, etc. When you call the macro,
you can either list the values in your call statement, or wait for SM to
prompt you for the values.
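A macro file following these rules might look like the sketch below. The macro name plotfile, its two parameters, and the assumption that the data file has two numeric columns are all mine, purely for illustration.

```
plotfile 2
        # $1 = name of the data file, $2 = color for the points
        data $1
        read { x 1 y 2 }
        erase
        limits x y
        box
        ctype $2
        points x y
        ctype default
```

If this were saved as plotfile.sm, you would load it with macro read plotfile.sm and then run it with something like plotfile mydata.dat red, where mydata.dat is again a made-up file name.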
Histograms
Histograms are a pretty handy way of looking at the distribution of your data, but
unfortunately they are fairly confusing in SM, and not just because you end up using the histogram
command twice. The simplest way to explain them is with an example: Let's say that we're curious about the
distribution of the number of goals that Michigan allows per game, because Montoya's either on
or off, so we suspect that
it might be bimodal. The macro is
here, and I'll go through it step by step:
- Start with an unsorted list of data points -- I did this by reading in
goals.dat, which is just a single-column list of goals allowed,
and named the vector goals.
- Decide what range of bins you want to cover. If you want to cover all of the data,
and you're just doing, say, integers, find the minimum and maximum values with vecminmax
goals min max .
- Create the array of bin points with set bins = $min, $max, 1 .
- Say you have data that's not just integers, and you want ten bins. The best
way to handle that is set bins = $min, $max, (($max - $min)/10) . And so
on. You get the picture.
- Now we bin the data, and save the counts in another vector. This is done with set
agoals = histogram(goals:bins) .
- As seen, the syntax is histogram(data you're binning :
bins to sort into) .
- agoals is the binned data, such that agoals[0] gives the number of
occurrences of (or data points in) bins[0] -- in this case, the number of games in which
0 goals were allowed.
- When setting limits, you want the number of goals allowed on the x-axis, and the binned number of
games on the y-axis -- so limits bins agoals .
- To plot the data, we need to use the histogram command again --
histogram bins agoals .
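Strung together, the steps above come out roughly as follows. This is not the original macro mentioned earlier, just my reconstruction of the sequence, and the erase, box, and axis labels are my own additions.

```
# goals.dat: a single column of goals allowed per game
data goals.dat
read { goals 1 }
# find the range of the data and make integer bins across it
vecminmax goals min max
set bins = $min, $max, 1
# first use of histogram: count the data points in each bin
set agoals = histogram(goals:bins)
# second use of histogram: draw it
erase
limits bins agoals
box
histogram bins agoals
xlabel Goals allowed
ylabel Number of games
```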
And I'll be damned! We always joked it was a bimodal distribution, but it kind of
is! And we have proof! Aw, shit, now I'll have to do this again at the end of this season . . .
Programming and Syntax
The if statement seems to be the most useful. It is most often used when creating
a vector, plotting a subset of your data, or running a block of
commands conditionally.
- When creating a vector using a conditional, use set name = expression
if (conditional). Like C++, use == to test for equality,
!= for 'not equal', && for 'and', and || for 'or'.
(help logical goes into more detail.)
- If you don't want to plot all your data points, use points x y if
(expression). (This also works for connect)
Bear in mind that just because the points are not all
displayed does not mean that the points are not all there. Setting limits
on x and y, or line-fitting to x and y, will do so to ALL of your points, not
just the ones you selected.
- When using the if statement in a programming block, the syntax is if
(logical expression) {list of statements}. If-else statements can also
be done by if (logical expression) {statements} else {other statements}.
You can also use do loops, and they're pretty simple. See the help
file.
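Here is a sketch of the three uses of if, plus a do loop. The vectors r and v, the cutoff of 200, and the variable n are all made up, and the do loop syntax is as I remember it, so check help do before trusting it.

```
# r and v are hypothetical vectors of the same length
# new vectors holding only the elements where v is above 200
set vfast = v if (v > 200)
set rfast = r if (v > 200)
# limits still uses ALL the points; only the selected ones are drawn
limits r v
box
points r v if (v > 200 && r != 0)
# if-else in a block of commands
define n 3
if ($n == 3) { echo three } else { echo other }
# a simple do loop (syntax assumed from memory)
do i = 0, 2 {
   echo pass $i
}
```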
Examples
I found a macro I wrote for an assignment that uses many of the commands explained
in this page. The macro was used to plot rotation velocities
or something, I don't know what. (This is why you should always comment what you do!)
The data is from the rotation curve of a galaxy, and I
plotted various things.
Function Fitting
- Line-fitting:
The quickest and simplest function-fitting is a line fit to a set of data. This is
done using lsq x y [x2 y2 rms], where the arguments inside [ ] are
optional. This fits the line y2 = $a*x2 + $b to x and y.
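A line fit and overplot might go something like this. The vectors r and v are hypothetical, vfit and rms are names I chose, and I am passing all five arguments from the syntax line; check help lsq for exactly what the last one stores.

```
# r and v are hypothetical data vectors
# fit v = $a*r + $b, and evaluate the fitted line at the same r values
lsq r v r vfit rms
# the slope and intercept are left in $a and $b
echo slope $a intercept $b
# plot the data with the fitted line on top
erase
limits r v
box
points r v
connect r vfit
```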
- If you can linearize the equation you are fitting to, whether by logarithms
or whatever, DO SO. Trust me. Defining an equation to fit to in SM is incredibly
nasty. Read the help file for linfit and you'll see what I mean.
- If you've read the help file and are still crazy enough to try a least squares
linear fit, here goes. We are basically finding the constants in the equation you
are fitting to. linfit is called with four values.
- The first is actually
a vector of vectors -- this is the list of vectors that you are fitting to. Note that
if you want to have a constant by itself in the equation, you must create a
vector of 1's, the same length as the others you are fitting to.
(There are lots of creative
and equally efficient ways of doing that.) If we are fitting Y = X + Z + W + Constant,
then create your vector set vec = {X Z W Ones}, and they can be referred
to as vec[0], etc. This must be done even if you only have one vector
you are fitting to.
- The second value in the linfit call is the
vector we are fitting to -- in this case, Y.
- The third vector stores the constants from
the fit. const[0] is the constant for the vec[0] vector, and
so on.
- The fourth vector is the variance of each value found.
So to fit the equation Y = X + Z + W + Constant, the steps are:
set one = 0*X + 1
set vec = {X Z W one}
linfit vec Y const var
The equation is now Y = const[0]*X + const[1]*Z + const[2]*W + const[3]
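Once linfit has run, the fitted values can be built from the constants using the equation above. In this sketch the first three lines are the steps already given, and Yfit and resid are names I made up.

```
# the fit itself, as above
set one = 0*X + 1
set vec = { X Z W one }
linfit vec Y const var
# evaluate the fitted equation at the data points
set Yfit = const[0]*X + const[1]*Z + const[2]*W + const[3]
# residuals, handy for checking how good the fit is
set resid = Y - Yfit
```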
Created by Rebecca Stanek, 2001.
Last modified 1/5/07.