Wiki dump

Posted: Tue Jun 02, 2026 8:38 pm
by wanne
I know that it isn't that easy to do dumps with MediaWiki. So I ask: am I allowed to dump it with HTTrack myself? I could do it gently, at a few hundred kbit/s. Should I provide the dumps so that other crawlers do not need to crawl over it again?

Re: Wiki dump

Posted: Wed Jun 03, 2026 1:34 pm
by eugenekay
Special:Export appears to be enabled. The wiki uses Cloudflare, so try not to exceed any rate limits.

Good Luck!
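
For anyone going the Special:Export route, here is a minimal sketch of fetching pages one at a time while staying under a bandwidth budget. It uses only the Python standard library. The base URL `https://wiki.example.org/wiki`, the helper names, and the 200 kbit/s and 1 second defaults are all assumptions for illustration; the real article path depends on how the wiki is configured, and the rate limits Cloudflare actually applies are not known here.

```python
import time
import urllib.parse
import urllib.request


def export_url(wiki_base: str, title: str) -> str:
    """Build a Special:Export URL for a single page title.

    wiki_base is assumed to be the wiki's article path,
    e.g. "https://wiki.example.org/wiki" (hypothetical).
    """
    # MediaWiki titles use underscores in place of spaces.
    quoted = urllib.parse.quote(title.replace(" ", "_"), safe="/:")
    return f"{wiki_base.rstrip('/')}/Special:Export/{quoted}"


def pause_for(num_bytes: int, max_kbit_per_s: float) -> float:
    """Seconds a transfer of num_bytes must take to stay under the rate cap."""
    return (num_bytes * 8) / (max_kbit_per_s * 1000)


def fetch_gently(wiki_base, titles, max_kbit_per_s=200.0, min_delay=1.0):
    """Yield (title, xml_bytes) pairs, sleeping between requests.

    Each request is padded so that the average rate stays below
    max_kbit_per_s and at least min_delay seconds pass per page.
    """
    for title in titles:
        started = time.monotonic()
        with urllib.request.urlopen(export_url(wiki_base, title)) as resp:
            xml = resp.read()
        yield title, xml
        elapsed = time.monotonic() - started
        budget = max(min_delay, pause_for(len(xml), max_kbit_per_s))
        if budget > elapsed:
            time.sleep(budget - elapsed)
```

Usage would look like `for title, xml in fetch_gently("https://wiki.example.org/wiki", ["Main Page"]): ...`, writing each `xml` to disk. The throttle is deliberately simple: it caps the average rate per page rather than shaping the transfer itself.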

Re: Wiki dump

Posted: Wed Jun 03, 2026 7:24 pm
by wanne
I would rather extract the Markdown from the HTML than deal with the XML.

Cloudflare is there to keep out unwanted humans. Crawlers that do not execute JavaScript, do not keep cookies, and can change their IP tend not to have much of a problem with it.
But that was more or less the reason why I was asking. If it were just a plain nginx, I would not have been afraid of breaking anything.

Re: Wiki dump

Posted: Thu Jun 04, 2026 9:43 am
by Sanqui
This is acceptable, just don't overload the server and follow the license (CC BY-NC-SA). :)