Post #1 · Posted at 2016-12-25 12:25:39am 8 years ago
This forum seems mostly dead, but nevertheless it is the best fit for a project I've been working on.
I wrote a small program to scrape simfile categories from Z-i-V. Basically, you give it a category number and optionally a prefix to filter for, and it downloads the matching simfiles. It unzips the downloaded files in almost all cases (there are some cases where it can't figure out how). If you run it more than once in the same directory, it only downloads new simfiles.
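For anyone curious how the "prefix filter" and "only downloads new simfiles" behaviour could work, here is a minimal sketch. It is not the code from the actual script: the function name `select_new_simfiles` is hypothetical, and it assumes each simfile ends up in a folder named after its title.

```python
import os
import tempfile


def select_new_simfiles(titles, dest_dir, prefix=""):
    """Return {simfile_id: title} for titles that match `prefix`
    and do not already exist as an entry in `dest_dir`.

    Assumption (not taken from the real script): every downloaded
    simfile is unzipped into a folder named exactly after its title.
    """
    existing = set(os.listdir(dest_dir)) if os.path.isdir(dest_dir) else set()
    wanted = {}
    for simfile_id, title in titles.items():
        if not title.startswith(prefix):
            continue  # filtered out by the optional prefix
        if title in existing:
            continue  # already downloaded on a previous run
        wanted[simfile_id] = title
    return wanted


if __name__ == "__main__":
    # Small demo with made-up titles and a throwaway directory.
    with tempfile.TemporaryDirectory() as dest:
        os.mkdir(os.path.join(dest, "[W1] Alpha"))
        titles = {1: "[W1] Alpha", 2: "[W1] Beta", 3: "[W2] Gamma"}
        print(select_new_simfiles(titles, dest, prefix="[W1]"))
```

Comparing against what is already on disk means a second run in the same directory needs no separate bookkeeping file.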
The thing I use it for most is to download the current week from a simfile contest.
I don't see any robots.txt file that asks bots to stay away, but it's still worth asking if this is okay. If there's nothing wrong with using this kind of script, would it be suitable for a larger audience?
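As a rough illustration of the robots.txt check, Python's standard `urllib.robotparser` can answer "may this bot fetch this URL?". The helper name `is_allowed` and the example.com URLs below are placeholders, not anything taken from Z-i-V or from the script.

```python
from urllib import robotparser


def is_allowed(robots_lines, url, user_agent="*"):
    """Check `url` against the given robots.txt lines.

    An empty list stands in for "no robots.txt found", in which
    case everything is allowed.
    """
    parser = robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Made-up rules, only for illustration.
    rules = ["User-agent: *", "Disallow: /private/"]
    print(is_allowed(rules, "https://example.com/simfiles/"))
    print(is_allowed(rules, "https://example.com/private/page"))
    # Against a live site you would instead call parser.set_url(...)
    # followed by parser.read() before can_fetch().
```

Passing this check only means the site has not opted out; asking first is still the polite thing to do.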
Also, is there any API on Z-i-V that returns the categories which are available or the songs available in a category? Right now it's getting the song list from a category by reading the webpage for that category. Similarly, I would like to add the ability to interactively choose which category to download, and that probably involves reading the simfiles homepage. If there's any way to get that information directly from Z-i-V, that would improve the reliability of the script and reduce the load on Z-i-V's servers.
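Here is a sketch of what "reading the webpage for that category" might look like using only the standard library. The link pattern `viewsimfile.php?simfileid=N` is an assumption about the page markup, and `SimfileLinkParser` and `parse_category_page` are hypothetical names, so treat this as an illustration rather than the script's real parsing code.

```python
import re
from html.parser import HTMLParser

# Assumed link format; adjust if the real page markup differs.
SIMFILE_LINK = re.compile(r"viewsimfile\.php\?simfileid=(\d+)")


class SimfileLinkParser(HTMLParser):
    """Collect {simfile_id: link text} from anchors that look like simfile links."""

    def __init__(self):
        super().__init__()
        self.simfiles = {}
        self._current_id = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = SIMFILE_LINK.search(href)
        if match:
            self._current_id = int(match.group(1))
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a matching anchor.
        if self._current_id is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_id is not None:
            self.simfiles[self._current_id] = "".join(self._text).strip()
            self._current_id = None


def parse_category_page(html):
    parser = SimfileLinkParser()
    parser.feed(html)
    parser.close()
    return parser.simfiles


if __name__ == "__main__":
    sample = (
        '<table><tr><td><a href="viewsimfile.php?simfileid=101">[W1] Alpha</a></td></tr>'
        '<tr><td><a href="index.php">Home</a></td></tr></table>'
    )
    print(parse_category_page(sample))
```

Scraping like this breaks whenever the page layout changes, which is why a proper API would be more reliable.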
The script is publicly available on GitHub, and I'll be happy to share it if it's kosher.
It seems like there are no objections, so I added a cheap GUI to the script and posted it here:
https://github.com/AngledLuffa/stepmania-tools/tree/master/ziv
There's a text interface in scrape_category.py, and there's a GUI in scrape_interface.py. If there's anything missing from the documentation, please let me know so I can update the files.
Post #2 · Posted at 2016-12-25 12:33:11am 8 years ago
Sigrev2 | |
---|---|
Member+ | |
4,204 Posts | |
Reg. 2009-10-17 | |
"suffering from success" |
>Z-i-V
c'mon, lad
Post #3 · Posted at 2016-12-25 12:40:52am 8 years ago
GadgetJax | |
---|---|
Member | |
757 Posts | |
Reg. 2015-09-20 | |
Quote: AnonyWolf
>Z-i-V
c'mon, lad
Did somebody mention me?
Post #4 · Posted at 2016-12-25 01:04:11am 8 years ago
Loodee | |
---|---|
Member+ | |
275 Posts | |
Reg. 2014-06-26 | |
this thing's nice, gj
Post #5 · Posted at 2016-12-25 01:27:00am 8 years ago
AngledLuffa | |
---|---|
Member | |
111 Posts | |
Reg. 2015-05-14 | |
Quote: Loodee
this thing's nice, gj
Glad to hear it's useful!
Quote: AnonyWolf
Quote: AngledLuffa
Z-i-V
c'mon, lad
Feel free to submit a pull request