Working with theoretical astrophysics, starting with an introduction to programming in IDL
We are going to start our study of stellar astrophysics theory and observations learning how theorists model various aspects of the stars. These are mathematical models, and our understanding of how stars "work" has increased dramatically (an understatement) with the invention of fast computers. To take on the role of a theorist, we must learn how to program computers.
"That's just great!" you may think, "But what language should I be programming in?" Good old Fortran was probably the best language for astronomers (and there are probably thousands upon thousands of Fortran programs still in use) since it basically was used as a huge calculator. For various rasons, Fortran has fallen out of favor. In my personal opinion, the C programming language does not have enough flexibility in passing variables between procedures and functions (discussed later). For this course we are going to use a language called "Interactive Data Language" or IDL. Its main drawback is that it has to be licensed, and that license is expensive. Not all institutions will have IDL available. But, it is similar enough to other languages, that the knowledge you gain here is easily transferred. Plus, it is becoming the language of choice for astronomers.
IDL is an exceedingly powerful language; however, we will just glimpse a small part of IDL's power in this course. We need a sophisticated scientific calculator: one that can do the same task multiple times, call in other routines, perform statistical tests, etc. We will not need IDL's plotting capability (although it would come in handy), nor its ability to process astronomical images. Perhaps the best part of IDL is that it is a command-line language. You enter IDL, you type in a command, you hit "return," and out pops an answer. This makes learning the language easier and testing various options in programming more transparent.
In this exercise, we will be learning the vocabulary of programming, and how to get the programming code to mindlessly do what we want it to do, over and over again. We will start with some easy, demonstrative examples, and eventually work our way up to quite complex examples. Along the way, we need to learn how IDL treats data types, variables, strings, operands, plus more.
IDL data types
IDL supports 11 different numeric data types (plus an "undefined" type) that can be manipulated by operators, functions, procedures, and programs. The 4 data types we will be using are:
int : [16 bits] Integers from -32768 to +32768 (2562)
float : [32 bits] Single-precision, floating-point, with six significant places. Magnitudes range from 10-38 to 1038 on machines with IEEE arithmetic.
double : [64 bits] Floating-point with approximately sixteen decimal digits.
string : A sequence of 0 to 32,767 characters
Variables : Variables are named "repositories" that contain data with a type and structure. In IDL, variables can be scalars (single numbers), arrays of up to eight dimensions, or structures composed of scalars, arrays, and/or other structures.
IDL in Interactive Mode
IDL works directly on the command line when you hit return. This makes it easy to try new statements and statement sequences. The "up" cursor arrow brings back earlier commands.
START IDL
On most Linux systems, just type idl .
You will see:
IDL Version 5.5a (linux x86). (c) 2001, Research Systems, Inc.
Installation number: 97443-1.
Licensed for use by: University of Washington
IDL>
You are ready to go! When learning a new language, the first thing the programming student does is tell the world hello. So, on the command line, type in the following (the comma is REQUIRED):
IDL> print, 'hello world'
What did the computer tell you?
IDL is a high-level language, but it must operate within certain constraints, and we must provide those constraints (on the other side, IDL constrains us, as you will find out). Through the program source code, we instruct the compiler how to define each variable in the program. When the program is run, the variable type remains fixed, unless we redefine it.
Assigning values to variables and manipulating those variables are at the heart of a successful program. In programming, we have the variable name that will be assigned a value on the left (ALWAYS) of the equal sign, and the value that will be assigned to it on the right of the equal sign.
Here are some simple examples. Type in the information as shown:
IDL> var = 1.0
IDL> print, var
IDL> my_var_123 = 1
IDL> print, my_var_123
IDL> boltzmann = 1.380658e-16
IDL> print, boltzmann
What answers did you get?
You can verify the type of the variable through the command help. Type the following:
IDL> help, var, my_var_123, boltzmann
What was the response?
Keeping Track of Variables
IDL and most programming languages allow you to name your variables just about anything you want, but not everything. IDL has some words that are reserved just for it, and there are some invalid names.
IDL reserved words: and, begin, case, common, do, else, end, endcase, endelse, endfor, endif, endrep, endwhile, eq, for, function, ge, goto, gt, if, le, lt, mod, ne, not, of, on_ioerror, or, pro, repeat, then until, while, xor.
Built-in function names should not be used as variable names: sin, cos, tan, intarr, to name just a few. There are a lot of them, and if you use one, you will get strange results. You can check the validity of a variable name by typing (question mark needed, then a space, then the variable name):
IDL> ? sex
IDL> ? sin
If the name is already defined as a function, the entry SIN function will appear in the list of search matches.
Variable names in IDL must begin with a letter, and may be up to 128 characters in length. Valid characters are a-z, the digits 0-9, and the underscore character (note: no dash or any other characters). Case is ignored, and there cannot be any spaces.
Which of the following variable names are invalid for IDL use, and why?
boltzmann, big_g, 123_climb_a_tree, print
tommy&greg, My First Variable Name
While IDL allows up to 128 characters in a variable name, why would one ever want to type that long of a word, over and over again? On the other hand, if you are writing a large program, then naming the variables a, aa, aaa, aaaa, b, bb, bbb, bbbb, etc. would make for extremely confusing code for anyone else, and for you after a week or so has gone by! Suggestion: come up with names that mean something about what they represent. For example:
boltzmann = 1.380658e-16
big_g = 6.673e-11 ;m^3 kg^-1 s^-2
lightspeed = 3.0e+8 ; in meters per second
freq = 4.0e+14 ; frequency
wave = lightspeed/freq ; calculate wavelength
You have defined boltzmann, big_g, lightspeed, freq, and wave to be definite values.
Arrays
Arrays in IDL provide a fast, efficient way to manipulate data IDL can handle arrays from one to eight dimensions.
Try this little ditty (which also introduces you to the plot function):
IDL> x = findgen(201) * 0.1
IDL> y = sin(x)
IDL> plot, x, y
Here, a floating-point array (x) is created with 201 elements. The first element is equal to 0.0, the last is equal to 20.0, and the step between them is 0.1. A second floating-point array (y) is created, where each element is equal to the sine of the corresponding element in the x array. One could create an integer array using the indgen() function, a double-precision array using the dindgen() function, and a string array using sindgen().
You will be creating arrays in this course similar to the following (no need to type the expression after the semicolon, I put that there as a comment to have you stop and write down what happened; you will make similar comments in all of your programs):
IDL> x = [0, 1, 2, 3, 4]
IDL> help, x ; What do you get?
IDL> print, x ; What do you get?
NOTICE THAT YOU USED SQUARE BRACKETS [ ] to define the array elements.
Multidimensional arrays can be created by using nested square brackets:
IDL> x = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
IDL> help, x
What did you get? What would you need to do to get a floating-point array?
IDL> print, x ; What did you get?
Programming languages differ in the way they read arrays. IDL and Fortran use what is called column-major format, meaning that consecutive array elements are stored sequentially in memory, with the first array dimension (the column dimension) varying the fastest. Let's look at a two-dimensional array with 4 columns and 4 rows:
A 0,0 |
A 1,0 |
A 2,0 |
A 3,0 |
A 0,1 |
A 1,1 |
A 2,1 |
A 3,1 |
A 0,2 |
A 1,2 |
A 2,2 |
A 3,2 |
A 0,3 |
A 1,3 |
A 2,3 |
A 3,3 |
This array is stored sequentially in memory in the following order:
A[0,0], A[1,0], A[2,0], .... A[0,2], A[1,2], .... A[1,3], A[2,3], A[3,3]
In IDL it is common to read or write entire arrays, and it is important to remember how they are indexed and which index changes the fastest. In IDL and Fortran, you read and write an array with dimensions [ ncolumns, nrows ]. In the C programming language, which uses row-major format, the last array dimension varies the fastest, so the array here would be dimensioned ( nrows, ncolumns ).
CRITICAL: You must remember that 0 is the first index of a row and a column of an array, not 1. Let's say you created an array of values. If you then worked with indexing of 1, 2, 3, 4 (let's say in a 4 x 1 array), you would be leaving out the first element (index of 0) and who knows what you'll get when you try for A 4 . To state this formally: Array indices (also known as subscripts ) in IDL are zero-based and positive. * (On the other hand, Fortran's array indices are one-based.)
Array indices can be specified in two different ways for retrieving the values in the array:
First way: array[index]
Second way: (array_expression)[index]
Try this as a demonstration of the first way:
IDL> arr = indgen(10) * 3
IDL> print, arr, format='(10i6)'
What did you get? What does format do? What are the "10" and the "6" specifying?
IDL> index = 5
IDL> print, arr[index] ;What did you get?
Here's a demonstration for the second way, where the particular element in the array has its value changed:
IDL> print, (arr * 10)[index] ;What did you get?
IDL> print, arr
Were the other elements affected? Why not? What would you have to do to change all of the values in the array?
Try this:
IDL> index = -1
IDL> print, arr[index] ;What happened? Why?
Indices can also be specified in array form, by a specific scalar value, through a range, or all:
IDL> index = [0, 2, 4, 1]
IDL> print, arr[index]
IDL> print, arr[6]
IDL> print, arr[0:3]
IDL> print, arr[*]
IDL> i = 3
IDL> print, arr[i-1:i+1]
Multidimensional arrays are indexed using a similar syntax. Examples:
IDL> arr = indgen(4, 4) * 4
IDL> print, arr
IDL> print, arr[0,0] & print, arr[3,3] ;What does the & allow you to do?
IDL> print, arr[*,0] ;All elements in the first row
IDL> print, arr[0,*] ;All elements in the first column
IDL> print, arr[0:2, 0:2] ;Over 2 dimensions: 9 elements
Operators and arrays – use as you would with simple scalars:
IDL> a = 2.0
IDL> b = 3.0
IDL> print, a + b
IDL> c = [1, 3, 5, 7]
IDL> d = [2, 4, 7, 8]
IDL> print, c + d
IDL> print, c * d
IDL> e = c / d
IDL> print, e ;What is e?
IDL> help, e ;Find out. Is this what you expected?
If any one of the variables in an expression is an array, the result will be an array.
IDL> f = 2
IDL> g = f * e
IDL> print, g & help, g ;What did you get?
You may, at some point in your future as a theoretical astrophysicist, need to work with arrays of different numbers of elements and dimensions. Or, you may have arrays that have the same number of elements, but different dimensions. Be sure to refer to an IDL manual before trusting your results.
An example of same number of elements but different dimensions:
IDL> a = [2.0, 4.0, 6.0, 8.0]
IDL> b = [[2.0, 4.0], [6.0, 8.0]]
What would be an example of 2 arrays with different number of elements and different dimensions?
IDL and Matrices
IDL can work with matrices in a “normal” manner – [row, column] format. It does that by including two operators that provide matrix multiplication functionality. If you have occasion to work with matrices, be sure to consult an IDL manual.
Relational and Boolean Operators
These operators form the guts of programming.
Relational operators in IDL: eq ne le lt ge gt
Boolean operators: not and or and xor
Relational operators produce a numeric result. Type the following:
IDL> a = 10.0
IDL> b = 20.0
IDL> help, a gt b
IDL> help, a lt b
What is true represented by? What is false represented by?
Here are some ways that relational operators are used in programming:
IDL> if (a lt b) then print, 'True' ;note the use of single quotes
IDL> if (a gt b) eq 0B then print, 'False'
IDL> if (a lt b) eq 1B then print, 'True'
In astrophysics, we must deal with extremely large or extremely small floating-point variables, many times choosing double-precision for more accuracy. However, there are two caveats: round-off error and machine precision. For example, you may wish to test the agreement between two numbers for the convergence of a model. Let's say one solution is found where a = 3456.0385967 and another solution is b = 3456.0385969, and you specify if (a eq b) then you have some other action take place. In this case, you may never get satisfaction. Here's one way to avoid this:
IDL> tolerance = 1.0e-6
IDL> if (abs(a - b) lt tolerance) then $
IDL> print, 'Solution found' ;What does the $ let you do?
Take a look at the following examples and consider what is being tested and what the result(s) is:
IDL> x = 1.1
IDL> if (x gt 0.0) and (alog(x) gt 0.0) then print, 'Log is positive'
IDL> x = 1.0
IDL> if (x gt 0.0) and (alog(x) gt 0.0) then print, 'Log is positive'
Try this one:
IDL> x = 0.0
IDL> if (x gt 0.0) and (alog(x) gt 0.0) then print, 'Log is positive'
What happened? Did IDL go ahead and test the second condition even though the first was false?
You do not want to have something like this happen in your program! What you get is a program that runs all the way through and when you look at your data, anticipating a cosmic-shattering discovery, what you find are error messages and/or columns and columns of NaN's. Not good.
Let's add another test that avoids this possibility, at least for the previous example:
IDL> x = 0.0
IDL> if (x gt 0.0) then if (alog(x) gt 0.0) then print, 'Log is positive'
You will need to test for the minimum of two scalars, and choose an action. Here's one way:
IDL> x = 10.0
IDL> y = 5.0 ;You may not know this, as results are produced in the program itself
IDL> if (x lt y) then z = x else z = y
IDL> print, z
Here's an IDL way of doing the same thing:
IDL> z = (x lt y) ? x : y
IDL> print, z
Now do assignment 1.
* Gumley, Liam E., 2001, Practical IDL Programming (Morgan Kaufmann Publishers, California)