Where I work we often use the boot disk's UUID or the first network card's MAC address to identify a machine. That's not perfect, since both can be tampered with too, but they are typically more stable than a machine's IP address.
Is the data anonymized before it is sent? Or only once it arrives on the server? Is it sent over a secure connection? Just some questions that come to mind, I'll try to look at the code later.
Headed for the second star to the right and straight on 'til morning...
Schultzter
Offline
So it runs /etc/cron.weekly/pkgstats. Does that mean it runs Sunday at 00:00? What happens if my machine is switched off then (laptop)?
Offline
@maggie: Anacron?
Basically it depends on which cron implementation you are using. But most usually have mechanisms in place to run *missed jobs*.
Offline
@schultzter you missed some important points that were mentioned by @Pierre
Not at all.
See:
1) I don't want to track individual users for privacy reasons.
2) Everything that is sent by pkgstats can easily be manipulated by the user without us noticing. So any idea based on sending data is flawed.
3) The IP hash is only used to prevent too easy flooding; not to track users or make the stats any more accurate.
4) There is no way to get exact values, but over time, if more and more people use pkgstats, single variations (e.g. when someone sends garbage) won't matter.
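To illustrate point 3, here is a minimal sketch of how a hashed IP could be used purely as a flood guard. This is not the actual pkgstats server code: the class name `FloodGuard`, the salt, and the one-week window are all assumptions made up for the example.

```python
import hashlib
import time
from typing import Dict, Optional


class FloodGuard:
    """Reject repeated submissions from the same hashed IP within a time window.

    Hypothetical illustration only; not the real pkgstats implementation.
    """

    def __init__(self, salt: str, window_seconds: float = 7 * 24 * 3600):
        self.salt = salt
        self.window_seconds = window_seconds  # assumed window, not from the source
        self._last_seen: Dict[str, float] = {}  # hashed IP -> time of last accepted submission

    def hash_ip(self, ip: str) -> str:
        # Only the salted digest is kept; the raw address is never stored.
        return hashlib.sha256((self.salt + ip).encode("utf-8")).hexdigest()

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True if a submission from this IP should be accepted."""
        now = time.time() if now is None else now
        key = self.hash_ip(ip)
        last = self._last_seen.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # same hash seen too recently: treat as flooding
        self._last_seen[key] = now
        return True
```

Note that the hash only says "this address submitted recently". It does nothing for accuracy, which matches points 2 and 4: the submitted data itself is still unverifiable.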
Offline
Hey glider, welcome to Arch Linux.
Be aware that you just responded to a post that is coming up on three years old.
Nothing is too wonderful to be true, if it be consistent with the laws of nature -- Michael Faraday
Sometimes it is the people no one can imagine anything of who do the things no one can imagine. -- Alan Turing
---
How to Ask Questions the Smart Way
Offline
Sorry to necropost, but this looks like the right place to ask.
Is there anywhere to get the raw data from pkgstats? The web page is a bit hard to analyze.
Offline
Yes!!! I recently made a project which scrapes the site and outputs the result as a JSON file. (Have updated the wiki at: https://wiki.archlinux.org/index.php/pkgstats with a link to the project.)
You will need to install Haskell to run it; alternatively, I've got the JSON file hosted here: http://trycatchchris.co.uk/files/packagestatistics.json. I could look into getting this dockerized, or even hosted and updated accordingly.
Do you have any project in mind for the data?
Last edited by psycho_tea_drinker (2016-11-28 09:01:29)
Offline
I did a little polish of the stats page. I might extract it into its own service some day. There is a JSON export now:
* https://www.archlinux.de/statistics
* https://www.archlinux.de/statistics/package
* https://www.archlinux.de/statistics/package.json
* https://www.archlinux.de/statistics/module.json
Expect some things to change though.
Offline
I did a little polish of the stats page. I might extract it into its own service some day. There is a JSON export now:
* https://www.archlinux.de/statistics/
This page is 404.
Sakura:-
Mobo: MSI MAG X570S TORPEDO MAX // Processor: AMD Ryzen 9 5950X @4.9GHz // GFX: AMD Radeon RX 5700 XT // RAM: 32GB (4x 8GB) Corsair DDR4 (@ 3000MHz) // Storage: 1x 3TB HDD, 6x 1TB SSD, 2x 120GB SSD, 1x 275GB M2 SSD
Making lemonade from lemons since 2015.
Offline
Yep, fixed that. Remove the trailing /.
Offline
@Pierre
With the "package.json" file:
1) How do we convert the 'count' to a percentage? Is there a maximum value somewhere, or do we just use 'filesystem' (count = 29901) as the reference point?
2) The file is about 2.1 MiB in size. Can you also provide a "package.json.tar.bz2"? This would get it down to about 308 KiB.
Last edited by Xavion (2017-09-01 03:12:33)
Offline
It's already gzip'ed and about 380KByte. But this is just a complete dump. Probably not that useful. I plan to decouple this API from the rest of the website and start improving on it. E.g. it would be great to ask for stats of a single package.
I still need to figure out what is going on with the numbers. Using some package as base sounds fine to me for now.
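A minimal sketch of the "use some package as base" idea, for anyone who wants to try it. It assumes you have already extracted a plain name-to-count mapping from the dump; how to get that out of package.json depends on the file's actual layout, which is not shown here. The function name `to_percentages` is made up for the example.

```python
from typing import Dict


def to_percentages(counts: Dict[str, int], reference: str = "filesystem") -> Dict[str, float]:
    """Express each package count as a percentage of a reference package's count.

    The reference package (e.g. 'filesystem') is assumed to be installed on
    practically every reporting machine, so its count stands in for the total.
    """
    if reference not in counts:
        raise KeyError(f"reference package {reference!r} not found in counts")
    base = counts[reference]
    if base <= 0:
        raise ValueError("reference count must be positive")
    return {name: 100.0 * count / base for name, count in counts.items()}
```

Usage would be something like `to_percentages(counts)["vim"]`. Keep in mind this is only an approximation: if the reference package is missing on some machines, every percentage comes out slightly too high.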
Offline
It's already gzip'ed and about 380KByte.
What's the URL? I want to start using it right away.
But this is just a complete dump. Probably not that useful.
How often is the dump updated (e.g. daily or weekly)?
Offline
The archlinux32.org project also offers the pkgstats package, and it seems to report its summary to https://pkgstats.archlinux.de/post as well.
The data can be distinguished by arch="$(uname -m)". Is there a way on https://pkgstats.archlinux.de/package to filter for i686 usage?
Or should the archlinux32.org project stop providing the current pkgstats package and implement a fork with a different POST server?
Thanks.
Offline
is there a way on https://pkgstats.archlinux.de/package to filter for i686 usage?
What do you mean filter for it? In old data maybe, but in current data: yes all i686 is already filtered out!
"UNIX is simple and coherent" - Dennis Ritchie; "GNU's Not Unix" - Richard Stallman
Offline
filtered = discarded? So i686 statistics are no longer available?
Let me rephrase my question: does it make sense to use and run the pkgstats package (same as on archlinux.org) on archlinux32?
Offline
I suppose it does collect the architecture along with everything else, and right now it does still have i686 included.
Offline
I'm more concerned that it is still on archlinux.de and not on archlinux.org; as far as I remember, the move to .org has been due since its conception.
Well, I suppose that this is some kind of signature, no?
Offline
Hi, I wonder where I can see the source code of the server-side processing? The link in post #2 seems to be dead currently.
Offline
I got "400 bad request" if I run pkgstats on my server. Any idea what's going on?
The output of pkgstats -s: https://cfp.vim-cn.com/cbfmx
The culprit might be the empty modules= line. The server is a Linode VM, which uses the kernel provided by Linode. It seems all modules are built into the kernel itself:
* There's only an empty modules.dep file in /lib/modules/$(uname -r)
* `lsmod` shows nothing
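For anyone wanting to check the same condition on their own machine, here is a small sketch that reads /proc/modules (the file `lsmod` formats) and reports whether any modules are loaded. Whether an empty modules= field is really what the server rejects is only my guess; the helper names are made up for the example.

```python
from pathlib import Path
from typing import List


def parse_proc_modules(text: str) -> List[str]:
    """Return module names from /proc/modules-style text (first column of each line)."""
    return [line.split()[0] for line in text.splitlines() if line.strip()]


def loaded_modules(path: str = "/proc/modules") -> List[str]:
    """Read the list of loaded kernel modules; empty if none or if the file is absent."""
    proc_file = Path(path)
    if not proc_file.exists():
        # Kernels built without loadable module support may not provide this file.
        return []
    return parse_proc_modules(proc_file.read_text())


if __name__ == "__main__":
    modules = loaded_modules()
    if modules:
        print(f"{len(modules)} modules loaded")
    else:
        print("no modules loaded; a modules= field built from this list would be empty")
```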
By the way, the value for mirror is incorrect due to a change in pacman 5.1. I applied a fix similar to https://lists.archlinux.org/pipermail/a … 04720.html and it works now.
Offline
Nice project. Just checked it out, but pages for package and module statistics https://pkgstats.archlinux.{de,org}/{package,module} don't show any stats. The German fun stats page does though. Is the project still alive?
Offline
https://pkgstats.archlinux.de/{package,module} still works fine! Tested with aur/firefox-beta-bin 66.0rc2-1.
Offline
My bad. Can confirm it's working! ("No script" was blocking.)
Offline
Just an update:
* pkgstats 2.4 no longer collects module data
* I improved the website https://pkgstats.archlinux.de/ and added some documentation
* An initial version of an API is available and documented at https://pkgstats.archlinux.de/api/doc
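A rough sketch of how a client might query such an API from Python using only the standard library. The base URL is inferred from the documentation link above, and the `packages/<name>` path is purely an assumption for illustration; check https://pkgstats.archlinux.de/api/doc for the real routes and response fields before relying on any of this.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: base URL inferred from the documentation link; not confirmed here.
API_BASE = "https://pkgstats.archlinux.de/api"


def package_url(name: str, base: str = API_BASE, path: str = "packages") -> str:
    """Build a per-package URL. The 'packages' path segment is hypothetical."""
    return f"{base}/{path}/{urllib.parse.quote(name, safe='')}"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints whatever the server returns; no particular fields are assumed.
    print(fetch_json(package_url("pacman")))
```

The path is kept as a parameter on purpose, so the helper can be pointed at whatever routes the documentation actually lists.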
Offline
I think the fun stats for DE usage may have a bug -- 63% of informant machines have KDE/plasma, but only 34% have kwin and 31% have plasma-desktop installed according to the pkg stats.
It's nice to see GNOME getting utterly trounced, but unfortunately I don't think it's accurate.
EDIT: plasma-desktop is probably the better indicator
Last edited by WorMzy (2019-06-22 09:28:56)
Offline