#1 2012-02-23 00:23:11

Registered: 2011-10-20
Posts: 16

Python help with parsing test files

I'm trying to write a script that parses text files and enters the information into a SQLite database, but I'm not sure of the best way to do it. Lines 1-4 are easy, as they have the same formatting every time. After that, however, things are broken down into blocks, each under a heading, like this:

line 1

possible random unneeded data

heading 1

heading 2

Under each heading is an undetermined amount of information; it could be 5 lines or 500 lines, I have no idea. What's the best way of putting all the lines between the first heading and the second heading into a list where I can play with them, or something like that? Note that the name of each heading is fixed, so I don't have to worry about that. Hopefully this is clear enough. I'm sorry I can't provide an actual example, as this is something I'm using at work. Any and all help is appreciated.


#2 2012-02-23 02:54:32

rockin turtle
From: Montana, USA
Registered: 2009-10-22
Posts: 227

Re: Python help with parsing test files

This problem could probably be solved more easily with sed, awk, etc.

If you want a pure Python solution, try this:

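#!/usr/bin/env python
# -*- coding: UTF-8 -*-

import sys, fileinput

def get_data( files, headings ):
	# hdata maps each heading to the list of lines found under it
	hdata, key, data = {}, '', []
	after_first_heading = False
	for line in fileinput.input( files ):
		line = line.rstrip( '\n' )
		if fileinput.isfirstline():
			# new file: save the pending block and start over
			if key: hdata[key] = data
			after_first_heading = False
			key = ''
			data = []
		if line in headings:
			# a heading closes the previous block and opens a new one
			after_first_heading = True
			if key: hdata[key] = data
			key = line
			data = []
		elif after_first_heading:
			data.append( line )
	# save the last block of the last file
	if key: hdata[key] = data
	return hdata

# note: if the same heading appears in several files, the last file wins
mydict = get_data(sys.argv[1:], ['heading 1', 'heading 2', 'heading 3'])
for k, v in mydict.items():
	print( '{}: {}'.format( k, v ))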

Call it with your text files as arguments:

$ scriptname file1.txt file2.txt ...
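
Since the end goal is a SQLite db, here is a rough sketch of how the parsed blocks might be stored, using the sqlite3 module from the standard library. The names split_blocks and store_blocks, the table name block_lines, and its columns are all made up for illustration, so adjust them to whatever your real schema looks like. It handles one file at a time and keeps a source column, so the same heading in different files does not overwrite anything.

```python
import sqlite3

def split_blocks(lines, headings):
    """Group the lines that follow each known heading.

    Anything before the first heading is skipped.
    """
    blocks, current = {}, None
    for raw in lines:
        line = raw.rstrip('\r\n')
        if line in headings:
            # a heading starts (or resumes) a block
            current = line
            blocks.setdefault(current, [])
        elif current is not None:
            blocks[current].append(line)
    return blocks

def store_blocks(conn, source, blocks):
    """Insert one row per line; the table layout is hypothetical."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS block_lines '
        '(source TEXT, heading TEXT, line_no INTEGER, content TEXT)'
    )
    rows = [
        (source, heading, n, text)
        for heading, items in blocks.items()
        for n, text in enumerate(items, 1)
    ]
    # parameterized insert, so odd characters in the text are safe
    conn.executemany('INSERT INTO block_lines VALUES (?, ?, ?, ?)', rows)
    conn.commit()
    return len(rows)

if __name__ == '__main__':
    sample = ['line 1', 'possible random unneeded data',
              'heading 1', 'a', 'b', 'heading 2', 'c']
    blocks = split_blocks(sample, ['heading 1', 'heading 2'])
    conn = sqlite3.connect(':memory:')  # use a file path for a real db
    print(store_blocks(conn, 'sample.txt', blocks))
```

For a real file you would pass the open file object to split_blocks instead of the sample list, and a file path instead of ':memory:' to sqlite3.connect.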


#3 2012-02-23 12:08:51

Registered: 2011-10-20
Posts: 16

Re: Python help with parsing test files

Thanks, I'll try this out and let you know. Unfortunately, I'm on Windows at work, so no sed/awk :(

