I have files that contain a large number of current samples (300,000, for example). I need to perform simple calculations on them (for example, to calculate energy).
Until now, I have used Matlab to load all those samples into an array and do all the calculations there.
However, that process is rather slow. What can I use to simplify this and replace Matlab with a script?
Can I use Perl or PHP to run this operation on a set of sample files and write the final calculated values to a new file?
Last edited by kdar (2012-05-11 13:21:05)
Offline
Offhand, I thought of the programming language J, which is an APL derivative (but uses no Greek letters).
Offline
I would like to write a simple script to run from Linux (without depending on Matlab).
-- I will look into J; I had never heard of it before. Can you run it from a terminal just like you would a Perl or PHP script in Linux?
Last edited by kdar (2012-05-11 13:22:43)
Offline
Perhaps you can convert your Matlab script to an Octave script? If you're not doing anything fancy, you may not have to change anything, in fact.
What about a simple Awk script? Note that I have no idea what Awk's performance is like on huge datasets.
Offline
From the website
http://www.jsoftware.com/
"J is a modern, high-level, general-purpose, high-performance programming language. J is portable and runs on Windows, Unix, Mac, and PocketPC handhelds, both as a GUI and in a console. True 64-bit J systems are available for XP64 or Linux64, on AMD64 or Intel EM64T platforms. J systems can be installed and distributed for free."
Basically it's APL, but better written.
edit: I am a fan of J but have never really used it. If the existing Unix tools (awk, bc, dc, etc.) are adequate, you should use them; otherwise this might be an option, unless someone has another programmatic solution.
Last edited by fschiff (2012-05-11 14:50:06)
Offline
Octave can be run straight from the terminal, am I right? I haven't used it to its full extent.
Last edited by kdar (2012-05-11 18:35:38)
Offline
Correct
Offline
Example:
$ cat cmds
(1:3)'
$ octave -q cmds
ans =
1
2
3
However, I would mainly prefer octave only if I was doing a lot of linear algebra, such as finding matrix inverses and eigenvectors.
It sounds like the calculations you have in mind would be fairly simple, so I would be tempted to go with jakobcreutzfeldt's suggestion and use awk. Awk would be extremely fast and your scripts would then work on any Unix-like machine without needing to install Octave/Matlab.
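For comparison, the one-pass, line-at-a-time style that makes awk attractive for this kind of job can be sketched in Python as well. This is only an illustration under assumptions the thread does not confirm: one numeric sample per line, blank lines and lines starting with `#` ignored, and a plain sum of squares standing in for the "simple calculation". The function name `sum_of_squares` is made up for this sketch.

```python
import io


def sum_of_squares(lines):
    """Accumulate the sum of squared samples in a single pass.

    Assumes one numeric sample per line (an assumption, not something
    the thread specifies). Blank lines and '#' comment lines are skipped.
    Returns (sum of squares, number of samples read).
    """
    total = 0.0
    count = 0
    for raw in lines:
        text = raw.strip()
        if not text or text.startswith("#"):
            continue
        value = float(text)
        total += value * value
        count += 1
    return total, count


if __name__ == "__main__":
    # Tiny in-memory stand-in for a sample file.
    demo = io.StringIO("1.0\n2.0\n\n3.0\n")
    print(sum_of_squares(demo))
```

Because the samples are consumed one at a time, nothing is held in memory beyond the running total, so passing an open file object instead of the in-memory demo should behave the same way for a 300,000-line file.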
Offline
+1 for awk.
Offline
I'm surprised no one has suggested numpy and scipy, which are modules for Python. Their syntax is similar to Matlab's, they're very fast, and Python is useful in other circumstances and easy to learn.
I did something similar for spectral data. Here's an example of what I did in Python:
#!/usr/bin/env python
# This program goes through Rayleigh line data and finds the mean shift
# in nanometers and the standard deviation
import sys
import time
import numpy as np

ray = []        # peak wavelength (nm) found in each file
filenames = []  # names of the files that were processed
line = 633      # laser line in nm

def rs(wavelength, laser):
    # Raman shift in rel. cm^(-1), for wavelengths given in nm
    return ((1.0 / laser) - (1.0 / wavelength)) * (10**7)

def main(argv):
    # Goes through a file and finds the peak position of the Rayleigh line
    f = np.loadtxt(argv).transpose()  # loads the file: row 0 = wavelength, row 1 = intensity
    indi = np.argmax(f[1])            # index of the peak of the Rayleigh line
    ray.append(f[0, indi])
    filenames.append(str(argv))

# Goes through each file named in the CLI call and applies the main function to it
for filename in sys.argv[1:]:
    main(filename)

# Use numpy for some basic calculations
mean = np.mean(ray)
StandardDeviation = np.std(ray)
median = np.median(ray)
variance = np.var(ray)
ramanshift = [rs(r, line) for r in ray]
rsmean = np.mean(ramanshift)
rsSD = np.std(ramanshift)
rsmedian = np.median(ramanshift)
rsvariance = np.var(ramanshift)
tname = str(time.asctime())

# Write all calculations to a file
with open('rayleigh_' + tname + '.dat', 'w') as output:
    output.write('#The files used for this compilation are:\n')
    for name in filenames:
        output.write('#' + name + '\n')
    output.write('The wavelengths of the Rayleigh line are (in nm):\n')
    for r in ray:
        output.write(str(r) + '\n')
    output.write('The raman shifts of the rayleigh line for ' + str(line) + 'nm are (in rel. cm^(-1)):\n')
    for shift in ramanshift:
        output.write(str(shift) + '\n')
    output.write('Mean = ' + str(mean) + ' nm, or ' + str(rsmean) + ' rel. cm^(-1)\n')
    output.write('Standard Deviation = ' + str(StandardDeviation) + ' nm, or ' + str(rsSD) + ' rel. cm^(-1)\n')
    output.write('Median = ' + str(median) + ' nm, or ' + str(rsmedian) + ' rel. cm^(-1)\n')
    output.write('Variance = ' + str(variance) + ' nm, or ' + str(rsvariance) + ' rel. cm^(-1)\n')
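For the original question (energy from current samples), the same numpy approach could look roughly like the sketch below. The thread never says how "energy" is defined or how the files are laid out, so everything specific here is an assumption: one sample per line, energy taken as E = R * sum(i^2) * dt, and the names `energy`, `process_files`, `dt` and `resistance` are hypothetical.

```python
import sys

import numpy as np


def energy(samples, dt=1.0, resistance=1.0):
    """Energy of a sampled current, E = R * sum(i^2) * dt.

    dt (sample interval) and resistance are assumed parameters; with the
    defaults this reduces to the plain sum of squared samples.
    """
    samples = np.asarray(samples, dtype=float)
    return resistance * np.sum(samples ** 2) * dt


def process_files(paths, out_path, dt=1.0, resistance=1.0):
    """Write one 'filename energy' line per input file to out_path."""
    with open(out_path, "w") as out:
        for path in paths:
            # ndmin=1 keeps a one-sample file as a 1-D array.
            samples = np.loadtxt(path, ndmin=1)
            out.write("%s %g\n" % (path, energy(samples, dt, resistance)))


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python energy.py results.dat sample1.dat sample2.dat
    process_files(sys.argv[2:], sys.argv[1])
```

Since `np.loadtxt` reads each file into memory in one go, this trades a little memory for very short code; for files of a few hundred thousand samples that should not be a problem.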
Last edited by Yurlungur (2012-05-13 21:14:54)
Offline
I decided to use Perl and some Bash to run through several files, but I will give the suggestions above a try too.
Offline