#76 2005-03-29 21:31:42

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: Pacman with db support?

mercy wrote:

add an hash to the binary-file -> flipped bits while transmission solved
add parity -> you can correct lot of faults
add backwardcompatibility -> never messup

Add complexity
Add issues with different platforms should they ever arise (different endianness).

Well, I guess I disagree, and I will leave it at that. ;)


"Be conservative in what you send; be liberal in what you accept." -- Postel's Law
"tacos" -- Cactus' Law
"t̥͍͎̪̪͗a̴̻̩͈͚ͨc̠o̩̙͈ͫͅs͙͎̙͊ ͔͇̫̜t͎̳̀a̜̞̗ͩc̗͍͚o̲̯̿s̖̣̤̙͌ ̖̜̈ț̰̫͓ạ̪͖̳c̲͎͕̰̯̃̈o͉ͅs̪ͪ ̜̻̖̜͕" -- -̖͚̫̙̓-̺̠͇ͤ̃ ̜̪̜ͯZ͔̗̭̞ͪA̝͈̙͖̩L͉̠̺͓G̙̞̦͖O̳̗͍

#77 2005-03-29 21:35:22

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: Pacman with db support?

mercy wrote:

add backwardcompatibility -> never messup

yeah, it's *that* easy... just go ahead and "add in" backwards compatibility!

#78 2005-03-29 21:35:31

berkus
Member
From: Tallinn, Estonia
Registered: 2005-03-29
Posts: 65

Re: Pacman with db support?

mercy wrote:

add an hash to the binary-file -> flipped bits while transmission solved
add parity -> you can correct lot of faults
add backwardcompatibility -> never messup

to fiddle out a good structure for that binary would be much easier/faster to implement and maintain in the future than any db-approach

you have all this already and it's called metakit


keep in touch.

#79 2005-03-29 21:43:50

mercy
Member
Registered: 2004-04-24
Posts: 62

Re: Pacman with db support?

well.. everything is somehow complex, so that's no argument at all :-)

oldschool bitmasks, and you can't have any issues reading the file on any system

e.g.

1. first 64 bits -> version
2. next 256 bits -> hash
3. next 64 bits -> bitcountoffollowingdata
4. next 8 bits -> type of data
5. next bitcountoffollowingdata bits -> data

repeat 3-5 as often as needed

done :)

very, very basic.. i guess a lot of real programmers can "upgrade" that :-)
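
A minimal sketch of reading and writing that layout, under assumptions the list above does not spell out: all integers are big-endian, payloads are whole bytes, and the 256-bit hash is SHA-256 over everything after the header. The names `pack_db` and `unpack_db` are made up for illustration.

```python
import hashlib
import struct

HEADER = struct.Struct(">Q32s")  # 64-bit version, 256-bit hash (big-endian)
RECORD = struct.Struct(">QB")    # 64-bit bit count, 8-bit type tag


def pack_db(version, records):
    """Serialize (type, payload_bytes) pairs into the layout listed above."""
    body = b"".join(
        RECORD.pack(len(payload) * 8, rtype) + payload
        for rtype, payload in records
    )
    # Assumption: the hash covers everything that follows the header.
    return HEADER.pack(version, hashlib.sha256(body).digest()) + body


def unpack_db(blob):
    """Return (version, [(type, payload), ...]); ValueError on a bad hash."""
    version, digest = HEADER.unpack_from(blob, 0)
    body = blob[HEADER.size:]
    if hashlib.sha256(body).digest() != digest:
        raise ValueError("hash mismatch")
    records, pos = [], 0
    while pos < len(body):
        bits, rtype = RECORD.unpack_from(body, pos)
        pos += RECORD.size
        nbytes = bits // 8  # assumption: payloads are whole bytes
        records.append((rtype, body[pos:pos + nbytes]))
        pos += nbytes
    return version, records
```

Spelling out the byte order in the format strings is what keeps the file readable across platforms; note that the reader still has to walk the records one after another to reach the last one.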

#80 2005-03-29 21:47:48

berkus
Member
From: Tallinn, Estonia
Registered: 2005-03-29
Posts: 65

Re: Pacman with db support?

i3839 wrote:
phrakture wrote:

you got a link? I keep finding either shareware, or russian sites

$ pacman -Qi tdb
Description    : TDB is a Trivial Database

I peeked at tdb a bit, it looks very good and simple. Just like what you'd use for sticking a lot of small files together ;)

I had a compile problem, however (in tdbtool.c I had to escape the help text so the compiler doesn't barf about an invalid string constant).

Upd: what's more important, there's a tdbtool command-line utility you can use to build/query the db! :)


#81 2005-03-29 21:51:22

berkus
Member
From: Tallinn, Estonia
Registered: 2005-03-29
Posts: 65

Re: Pacman with db support?

mercy wrote:

1. first 64 bit -> version
2. next 256 bit -> hash
3. next 64 bit -> bitcountoffollowingdata
4. next 8 bit -> type of data
5. next bitcountoffollowingdata bit -> data

this is nonsense. you could just as well store all the text as one single array plus a separate index of fixed-size records, and that would be a lot faster (your format needs a seek per record to find successive data).

and why implement such an arbitrary binary format when something as simple as cdb/tdb does exactly the same and is already written and (!!) tested code... i trust Andrew Tridgell to have his stuff working :)
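
To illustrate the "one array plus a fixed-record-size index" idea, here is a rough sketch. The entry layout (64-bit offset, 32-bit length, big-endian) and the names `build` and `fetch` are assumptions picked for the example; this is not how cdb or tdb store things internally.

```python
import struct

ENTRY = struct.Struct(">QI")  # per-record offset (64-bit) and length (32-bit)


def build(texts):
    """Concatenate all texts into one blob and build a fixed-size-entry index."""
    blob, index = bytearray(), bytearray()
    for text in texts:
        data = text.encode("utf-8")
        index += ENTRY.pack(len(blob), len(data))
        blob += data
    return bytes(blob), bytes(index)


def fetch(blob, index, i):
    """Read record i with one index lookup, without scanning earlier records."""
    offset, length = ENTRY.unpack_from(index, i * ENTRY.size)
    return blob[offset:offset + length].decode("utf-8")
```

Because every index entry has the same size, record i is found by plain arithmetic (`i * ENTRY.size`) rather than by hopping over all the records before it.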


#82 2005-03-29 22:09:27

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: Pacman with db support?

mercy, you either do not understand or are like 80 years old - fixed-size binary records are a thing of the past.  They are not scalable, and that is the key point of this whole "pacman needs a db" argument - that the current implementation isn't scalable.

you seem to be one of those "why use Z when I can just write it myself" people.  There are a huge number of data storage libraries out there... that's what we're talking about.  We're not talking about running a mysql database backend for pacman... we're talking about some C libraries which happen to read formatted files... it can't get any simpler.

what you suggest... it's just flat out not scalable... if anything changes you totally annihilate backwards compatibility.  Show me *any* program which does this... show me *any* standard which suggests this... you cannot.  HTTP is a long-standing standard which is based on delimiters (newlines), XML is based on formatting; there is nothing you can find which makes fixed binary data formats look like a good idea.

#83 2005-03-29 22:20:03

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: Pacman with db support?

no kidding. Andrew, of samba fame, knows his shizznit..he wrote tdb.
D. J. Bernstein, of qmail fame, wrote cdb.  Qmail uses cdb.


tdb
http://samba.org/~tridge/
http://sourceforge.net/projects/tdb/

cdb
http://cr.yp.to/cdb.html

I think tdb, and maybe cdb, are the best db solutions presented so far.

here is some benchmarking done by the qdbm guys
http://qdbm.sourceforge.net/
scroll down to the brother's section, read, and click on the pdf..

#84 2005-03-29 22:20:33

berkus
Member
From: Tallinn, Estonia
Registered: 2005-03-29
Posts: 65

Re: Pacman with db support?

btw, i would like to hear about the constraints you are placing on the backend storage and retrieval

1) how many indexes will there be?
  a) one, indexing on a compound PACKAGE+VERSION string
  b) two, indexing on the PACKAGE and VERSION fields
  c) multiple, indexing on an arbitrary set of package attributes for fast searching

2) what is the main usage pattern for the storage?
i haven't looked into pacman myself, so i can only guess, but i believe it needs to
   a) enumerate all available packages to see which ones need updating/removing
   b) search for a particular package based on some criteria
     b1) given that the most interesting thing for a package manager is resolving dependencies, the most used search pattern would be like (PACKAGE='somename' AND (VERSION BETWEEN 'somesmallervalue' AND 'somelargervalue'))


correct me if i'm wrong (i'm collecting this to offer a more reasonable solution than just picking db names)


#85 2005-03-29 22:33:06

berkus
Member
From: Tallinn, Estonia
Registered: 2005-03-29
Posts: 65

Re: Pacman with db support?

cactus wrote:

here is some benchmarking done by the qdbm guys
http://qdbm.sourceforge.net/
scroll down to the brother's section, read, and click on the pdf..

while qdbm might look very well balanced, its api is rather horrid, so i'd stick to cdb if you can allow pacman to use a separate run for updating the cdb file (either from cron or manually), and to tdb if you want pacman to write back into the tdb file itself (even from multiple concurrent processes, at least as far as i understood their claim).

if the answer to question 1) above is b) or c), cdb/tdb will have to be extended to maintain more indices than they do now (i worked with cdb a bit and i think it would not be hard)


#86 2005-03-29 22:35:54

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: Pacman with db support?

just had the link because of the benchmark they did...

#87 2005-03-29 22:41:14

berkus
Member
From: Tallinn, Estonia
Registered: 2005-03-29
Posts: 65

Re: Pacman with db support?

cactus wrote:

just had the link because of the benchmark they did...

better answer the two questions above ;)


#88 2005-03-29 22:44:33

c0ldevil
Member
Registered: 2004-10-10
Posts: 22

Re: Pacman with db support?

I agree with using very simple databases like cdb/tdb/metakit. I've checked them all, and although they seem to fit the part, they haven't been updated recently. The last cdb version seems to be from 2000, tdb's from 2001, and metakit's latest version is from early 2004. Metakit seemed to be in heavy development until January 2004, when version 2.4.9.3 was released, while its bug tracking system seems to maintain normal activity. I couldn't find a working homepage for tdb, apart from its project page at Sourceforge. Not only has cdb not been updated in a long time, but the programs which use cdb and are referred to on its webpage haven't been updated since 2001 or even 1998 either, which makes me wonder whether development has really stopped, since it hasn't even reached version 1.0 (which might not mean anything). :?

This doesn't say anything about the databases themselves, but it sure makes you think about their support. If any of these projects is found to be ideal but lacks proper developer support, like security/bug fixes or any other updates needed to make it a stable database, its development could be taken over by Arch developers if the license permits (tdb - GPL, metakit - MIT, cdb ?).

#89 2005-03-29 22:47:54

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: Pacman with db support?

1)
I imagine there would be one index, on package name. Then a simple text string containing parseable elements as the data associated with the name (like an array attached to a hash element).

2)
I would think reading would be the predominant activity. Updates (write) only on sync operation.

#90 2005-03-29 22:56:12

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: Pacman with db support?

cdb wins in my book, check out the benchmarks
and the api isn't bad, it's like 10 functions....

about the normal data usage, I think modeling the data after the usage is a bad idea - it tends to produce non-normalized layouts.  The best thing to do would be to lay out the data first, and extend from there.  The simplest layout would be:

table <packages> (has all 1-to-1 fields, plus an "installed" flag)
table <files> files in the package, joined to packages
table <depends> same
...blah... blah

the data would be normalized then.  The common usage would be very simple...

-Ql "string" : select f.* from packages p, files f where p.id = f.id and p.name = 'string'

-Ss "string" : select * from packages where name like '%string%' or description like '%string%'

-Sy would import new versions into the DB... assuming the "pkgrel" is split to its own column, there will never be a conflict... checking for updates would be rather simple then:
select p2.name from packages p1, packages p2 where p1.name = p2.name and ( p1.version != p2.version or p1.release != p2.release)
and then delete the old rows....
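
The queries above can be tried out with a quick sketch. It uses Python's built-in sqlite3 module purely as a convenient way to run SQL; nobody in this thread is proposing SQLite as the backend. The column names (`pkgver`, `pkgrel`, `package_id`), the use of the "installed" flag to tell old rows from newly synced ones, and all sample versions are assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE packages (id INTEGER PRIMARY KEY, name TEXT, pkgver TEXT,
                           pkgrel TEXT, description TEXT, installed INTEGER);
    CREATE TABLE files (package_id INTEGER REFERENCES packages(id), path TEXT);
    INSERT INTO packages VALUES
        (1, 'pacman', '2.9.5', '1', 'A package manager', 1),
        (2, 'pacman', '2.9.6', '1', 'A package manager', 0),
        (3, 'tdb', '1.0.6', '1', 'TDB is a Trivial Database', 1);
    INSERT INTO files VALUES
        (1, '/usr/bin/pacman'), (1, '/etc/pacman.conf'), (3, '/usr/bin/tdbtool');
""")


def list_files(name):
    """Roughly -Ql: files owned by the installed package `name`."""
    rows = con.execute(
        "SELECT f.path FROM packages p JOIN files f ON f.package_id = p.id "
        "WHERE p.name = ? AND p.installed = 1 ORDER BY f.path", (name,))
    return [r[0] for r in rows]


def search(term):
    """Roughly -Ss: substring match on name or description."""
    like = "%" + term + "%"
    rows = con.execute(
        "SELECT DISTINCT name FROM packages "
        "WHERE name LIKE ? OR description LIKE ? ORDER BY name", (like, like))
    return [r[0] for r in rows]


def upgradable():
    """Installed packages for which a synced row has another version/release."""
    rows = con.execute(
        "SELECT new.name FROM packages cur "
        "JOIN packages new ON new.name = cur.name "
        "WHERE cur.installed = 1 AND new.installed = 0 "
        "AND (cur.pkgver != new.pkgver OR cur.pkgrel != new.pkgrel)")
    return [r[0] for r in rows]
```

Without something like the installed flag in the last query, the self-join would report every differing pair twice, once in each direction.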

#91 2005-03-29 23:05:24

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: Pacman with db support?

phrak, you are thinking in terms of a relational database, not a hash table.

here is an example of cdb in action:
http://pilcrow.madison.wi.us/python-cdb … 32.Example

cdb[pkgname]="%ver=1.1.3%info=this is a blah blah blah blah package%depends=package1,package2%"

the real thing would likely be better thought out than my example above, but I would imagine something similar would be easiest. Just a hash element, with parseable string data.
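
A rough sketch of parsing such a value, using a plain dict as a stand-in for the cdb file (a real cdb would hand back raw bytes). The helper name `parse_record` and the rule that `depends` is comma-separated are assumptions read off the example string above.

```python
def parse_record(raw):
    """Split '%key=value%key=value%' into a dict; 'depends' becomes a list."""
    fields = {}
    for chunk in raw.strip("%").split("%"):
        if not chunk:
            continue  # tolerate an empty record
        key, _, value = chunk.partition("=")
        fields[key] = value
    fields["depends"] = [d for d in fields.get("depends", "").split(",") if d]
    return fields


# Stand-in for the cdb: package name -> parseable string
db = {
    "pkgname": "%ver=1.1.3%info=this is a blah blah blah blah package"
               "%depends=package1,package2%",
}
```

Usage would be something like `parse_record(db["pkgname"])["depends"]`. Note that this breaks as soon as a value contains a percent sign itself.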

#92 2005-03-29 23:38:10

c0ldevil
Member
Registered: 2004-10-10
Posts: 22

Re: Pacman with db support?

I think that a hash element with an array where information is indexed makes more sense than just a string. 0 could be the name, 1 the version, 2 the info, etc... However, there is the -Ss problem... and what if you want to search for a specific file?

#93 2005-03-29 23:48:28

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: Pacman with db support?

cactus wrote:

phrak, you are thinking in terms of a relational database, not a hash table.

ack, I was under the impression the interface supported the notion of "tables"... I didn't know it was just a hash lookup... it'd still be better, but that makes me think less of it...

#94 2005-03-29 23:50:42

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: Pacman with db support?

yeah...creates new issues doesn't it..
and I suggested a string to make it as available to other languages as possible. An array in one language might not be the same as an array in another language (at least storage wise).

#95 2005-03-29 23:59:38

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: Pacman with db support?

i guess you can always implement the table idea with multiple files... package-files.cdb, packages.cdb, package-depends.cdb - and use the same keys between them... that wouldn't be hard...

I agree with cactus... string data would be the best... especially considering it's almost all string data anyway (even the version is a string, so you can have "rc1" and things like that)...
I'd vote for a more verbose delimiter than a percent... something like "]]><[[" (I just wanted to make some pseudo-ascii art)... it'd be better to use something odd and unique that could never show up in anything textual in a package (a percent may show up in URLs if they're not decoded)
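
A small sketch of why the delimiter choice matters. The helper names `join_fields` and `split_fields` are made up, and the URL in the usage note is a placeholder.

```python
DELIM = "]]><[["


def join_fields(values, delim=DELIM):
    """Join field values, refusing any value that contains the delimiter."""
    for value in values:
        if delim in value:
            raise ValueError("delimiter %r found in %r" % (delim, value))
    return delim.join(values)


def split_fields(raw, delim=DELIM):
    """Inverse of join_fields for values that passed its check."""
    return raw.split(delim)
```

With fields such as `["1.0rc1", "http://example.org/a%20b"]`, joining on "%" is rejected because the encoded URL already contains one, while the odd multi-character delimiter round-trips cleanly.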

#96 2005-03-30 00:12:37

cactus
Taco Eater
From: t͈̫̹ͨa͖͕͎̱͈ͨ͆ć̥̖̝o̫̫̼s͈̭̱̞͍̃!̰
Registered: 2004-05-25
Posts: 4,622

Re: Pacman with db support?

true true..
maybe even a nonprinting ascii value..something out of the range of normal chars...Like the "beep" ascii value (aka "bell"). :rolleyes:

then we can claim that pacman now includes bells, if not whistles. :lol:

#97 2005-03-30 00:16:03

c0ldevil
Member
Registered: 2004-10-10
Posts: 22

Re: Pacman with db support?

Yeah, string data is more easily accessible by many different languages, that's true.

What I think is most important right now is to look at the whole picture carefully, set basic requirements and goals that the target system should meet, and go from there to define the database structure.

The multiple files idea isn't bad either..would you store the same info (same parseable string) for each package in each file, or only the info important to each category?

#98 2005-03-30 00:24:32

phrakture
Arch Overlord
From: behind you
Registered: 2003-10-29
Posts: 7,879

Re: Pacman with db support?

c0ldevil wrote:

The multiple files idea isn't bad either..would you store the same info (same parseable string) for each package in each file, or only the info important to each category?

no, I was saying it like this (pseudo code):

hashkey = "my_package_name"
pacdb = open("packages.cdb")
# this will have all the description, version, etc
print hashkey, "is:\n", pacdb[hashkey].replace("]]><[[", "\n")

filesdb = open("package-files.cdb")
for f in filesdb[hashkey]:
   print hashkey, "contains file", f

dependsdb = open("package-depends.cdb")
for d in dependsdb[hashkey]:
   print hashkey, "depends on", d

... so on and so on...

#99 2005-03-30 00:39:09

c0ldevil
Member
Registered: 2004-10-10
Posts: 22

Re: Pacman with db support?

Hm, it's pretty simple.. at first sight it seems to work pretty well! And if you wanted to do a -Ss, it's as simple as reading through packages.cdb.. Of course there could be a more optimised solution, but this is simple, seems to perform pretty well and makes fetching information easy.

I guess I'll think about other possible solutions to the problem in the meantime.
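
For what it's worth, here is how I picture -Ss as a plain scan over packages.cdb. A dict stands in for the cdb file, and the record layout (version, then description, joined with the "]]><[[" delimiter) and the sample versions are assumptions made up for the example.

```python
DELIM = "]]><[["

# Stand-in for packages.cdb: package name -> joined fields
packages = {
    "pacman": "2.9.5" + DELIM + "A package manager",
    "tdb": "1.0.6" + DELIM + "TDB is a Trivial Database",
}


def search(db, term):
    """Emulate -Ss: look at every record, match on name or description."""
    term = term.lower()
    hits = []
    for name, raw in db.items():
        version, description = raw.split(DELIM)
        if term in name.lower() or term in description.lower():
            hits.append((name, version))
    return sorted(hits)
```

Every search touches every record, so the cost grows with the number of packages, but only one file has to be read.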

#100 2005-03-30 09:42:22

berkus
Member
From: Tallinn, Estonia
Registered: 2005-03-29
Posts: 65

Re: Pacman with db support?

cactus wrote:

1)
I imagine there would be one index, on package name. Then a simple text string containing parseable elements as the data associated with the name (like an array attached to a hash element).

2)
I would think reading would be the predominant activity. Updates (write) only on sync operation.

Then cdb floats the boat.

(About the "unmaintained" question someone raised here - cdb is mature enough and not being updated for a long time means it didn't take the "creeping featuritis" route.)

