Process sampled data and perform simple calculations to bypass Matlab?

kdar · 2012-05-11 06:34:56

I have files that contain a large number of current samples (300,000, for example). I need them to perform simple calculations (for example, to calculate energy).

Before, I used Matlab to load all those samples into an array and do all the calculations there.

However, that process is rather slow. What can I use to simplify this and replace Matlab with a script?
Can I use Perl or PHP to perform this operation on a set of sample files and write the final calculated values to a new file?

Last edited by kdar (2012-05-11 13:21:05)

fschiff · 2012-05-11 09:48:56

Offhand, I thought of the programming language J, which is an APL derivative (but uses no Greek letters).

kdar · 2012-05-11 13:22:02

I would like to write a simple script to run from Linux (without depending on Matlab).

-- I will check out J; I had never heard of it before. Can you run it from a terminal, just like you would a Perl or PHP script in Linux?

Last edited by kdar (2012-05-11 13:22:43)

jakobcreutzfeldt · 2012-05-11 14:15:31

Perhaps you can convert your Matlab script to an Octave script? If you're not doing anything fancy, you may not have to change anything, in fact.

What about a simple Awk script? Note that I have no idea what Awk's performance is like on huge datasets.

fschiff · 2012-05-11 14:42:47

From the website
http://www.jsoftware.com/

"J is a modern, high-level, general-purpose, high-performance programming language. J is portable and runs on Windows, Unix, Mac, and PocketPC handhelds, both as a GUI and in a console. True 64-bit J systems are available for XP64 or Linux64, on AMD64 or Intel EM64T platforms. J systems can be installed and distributed for free."

Basically it's APL, but better written.

edit: I am a fan of J but have never really used it. If the existing Unix tools (awk, bc, dc, etc.) are adequate, you should use them; otherwise this might be an option, unless someone has another programmatic solution.

Last edited by fschiff (2012-05-11 14:50:06)

kdar · 2012-05-11 18:35:25

jakobcreutzfeldt wrote:

Perhaps you can convert your Matlab script to an Octave script? If you're not doing anything fancy, you may not have to change anything, in fact.
What about a simple Awk script? Note that I have no idea what Awk's performance is like on huge datasets.

Octave can just be run from the terminal, am I right? I haven't used it to its full extent.

Last edited by kdar (2012-05-11 18:35:38)

jakobcreutzfeldt · 2012-05-12 09:31:59

kdar wrote:

Octave can just be run from the terminal, am I right? I haven't used it to its full extent.

Correct

/dev/zero · 2012-05-12 10:15:20

Example:

$ cat cmds
(1:3)'
$ octave -q cmds
ans =

   1
   2
   3

However, I would mainly prefer octave only if I was doing a lot of linear algebra, such as finding matrix inverses and eigenvectors.

It sounds like the calculations you have in mind would be fairly simple, so I would be tempted to go with jakobcreutzfeldt's suggestion and use awk. Awk would be extremely fast and your scripts would then work on any Unix-like machine without needing to install Octave/Matlab.
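For comparison, the one-pass, line-by-line approach that awk takes can be sketched in Python. This is only a minimal sketch under assumptions the thread does not confirm: one sample per line (first field, like awk's $1), and "energy" taken to mean the sum of squared samples times a sample interval dt. The function name signal_energy and the dt parameter are made up for illustration.

```python
import io


def signal_energy(lines, dt=1.0):
    """Sum of squared samples times the sample interval dt, in one pass.

    Assumed definition of "energy"; adjust to whatever the data calls for.
    """
    total = 0.0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        value = float(line.split()[0])  # first field on the line, like awk's $1
        total += value * value
    return total * dt


if __name__ == "__main__":
    # In-memory stand-in for a sample file: 1 + 4 + 9 = 14
    demo = io.StringIO("1.0\n2.0\n\n3.0\n")
    print(signal_energy(demo))
```

Because the samples are consumed one line at a time, nothing is held in memory, so an open file object can be passed in place of the StringIO and the file size hardly matters.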

Trilby · 2012-05-12 11:44:35

+1 for awk.

Yurlungur · 2012-05-13 21:06:57

I'm surprised no one has suggested NumPy and SciPy, which are modules for Python. Their syntax is similar to Matlab's, they're very fast, and Python is useful in other circumstances and easy to learn.

I did something similar for spectral data. Here's an example of what I did in python:

#!/usr/bin/env python

# This program goes through Rayleigh line data and finds the mean shift
# in nanometers and the standard deviation

import sys
import time
import numpy as np

ray = []
filenames = []
line = 633

def rs(wavelength,laser):
    return ((float(1)/laser)-(float(1)/wavelength))*(10**7)

def main(argv): # Goes through a file and finds the peak position of the Rayleigh line
    f = np.loadtxt(argv).transpose() # loads the file; f[0] is used as wavelength, f[1] as intensity
    indi = np.argmax(f[1]) # finds the index of the peak of the Rayleigh line
    ray.append(f[0,indi])
    filenames.append(str(argv))

# Goes through each file named in the CLI call and applies the main function to it
for filename in sys.argv[1:]:
    main(filename)

# Use numpy for some basic calculations
mean = np.mean(ray)
StandardDeviation = np.std(ray)
median = np.median(ray)
variance = np.var(ray)

ramanshift = [rs(ray[i],line) for i in range(len(ray))]
rsmean = np.mean(ramanshift)
rsSD = np.std(ramanshift)
rsmedian = np.median(ramanshift)
rsvariance = np.var(ramanshift)

tname = str(time.asctime())

# Write all calculations to a file
output = open('rayleigh_'+tname+'.dat','w')
output.write('#The files used for this compilation are:\n')
for i in range(len(filenames)):
    output.write('#'+filenames[i]+'\n')
output.write('The wavelengths of the Rayleigh line are (in nm):\n')
for i in range(len(ray)):
    output.write(str(ray[i])+'\n')
output.write('The Raman shifts of the Rayleigh line for '+str(line)+'nm are (in rel. cm^(-1)):\n')
for i in range(len(ray)):
    output.write(str(ramanshift[i])+'\n')
output.write('Mean = '+str(mean)+' nm, or '+str(rsmean)+' rel. cm^(-1)\n')
output.write('Standard Deviation = '+str(StandardDeviation)+' nm, or '+str(rsSD)+' rel. cm^(-1)\n')
output.write('Median = '+str(median)+' nm, or '+str(rsmedian)+' rel. cm^(-1)\n')
output.write('Variance = '+str(variance)+' nm^2, or '+str(rsvariance)+' rel. cm^(-2)\n')
output.close()
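
For the original question (energy from files of current samples), the same NumPy approach might look roughly like the sketch below. Everything specific here is an assumption rather than something stated in the thread: one sample per line, energy defined as the sum of squared samples times a sample interval dt, and the names energy, process_files, and the output layout are hypothetical.

```python
import sys
import numpy as np


def energy(samples, dt=1.0):
    """Discrete signal energy: sum of squared samples times dt (assumed definition)."""
    samples = np.asarray(samples, dtype=float)
    return float(np.sum(samples ** 2) * dt)


def process_files(paths, out_path, dt=1.0):
    """Compute the energy of each sample file and write one result line per file."""
    with open(out_path, "w") as out:
        for path in paths:
            # Assumes a plain-text file with one sample per line
            samples = np.loadtxt(path, ndmin=1)
            out.write("%s\t%.6g\n" % (path, energy(samples, dt)))


if __name__ == "__main__":
    # Hypothetical usage: python energy.py results.txt file1.txt file2.txt ...
    if len(sys.argv) > 2:
        process_files(sys.argv[2:], sys.argv[1])
```

The squaring and summing happen inside NumPy rather than in a Python loop, which is where most of the speed comes from on files with a few hundred thousand samples.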

Last edited by Yurlungur (2012-05-13 21:14:54)

kdar · 2012-05-14 21:50:59

I decided to use Perl and some bash to run through several files, but I will give the suggestions above a try too.
