
Author Topic: WIKI RECOVERY  (Read 7979 times)

Offline CentralWare

  • Administrator
  • Hero Member
  • *****
  • Posts: 1477
WIKI RECOVERY
« on: October 05, 2022, 06:44:29 AM »
For those who wish to participate in the Wiki Recovery Program, this is the forum board in which to do so!

Enclosed is an attachment of all of the text files that the wiki is comprised of.
The wiki is built on "DokuWiki", and its Syntax and Formatting page needs to be reviewed to make heads or tails of how to make text bold, italic, etc.  (If you compare the existing text files with the Syntax link to the left, things tend to make more sense.)

When submitting updates:
  • Please indicate in the message body if it's a NEW PAGE or UPDATE
  • Please provide the link to the Way Back Machine used to create or modify the content being posted
  • Attach your TXT file to your post; please do not submit archives/zips as my firewall blocks exe's down to zips/rars/etc.; text files only please.

NOTE: The wiki formally crashed somewhere around April of 2020; the last Way Back archive was on March 3, 2020 at the link above.  There are more than 300 pages of content (not including screenshots and the like), about 265 of which need to be physically compared, and possibly some that need to be created from scratch.  If you've posted a file and you notice your post is missing/rewritten, it's been replaced with a single-line post:
path/filename.txt complete.
Over 90% of all computer problems can be traced back to the interface between the keyboard and the chair

Offline mocore

  • Sr. Member
  • ****
  • Posts: 449
  • ~.~
Re: WIKI RECOVERY
« Reply #1 on: October 05, 2022, 07:21:26 AM »


just ftr

might https://translate.google.com be of some use

eg: https://translate.google.com/?sl=auto&tl=en&text=https%3A%2F%2Fweb.archive.org%2Fweb%2F20200303035953%2Fhttp%3A%2F%2Fwiki.tinycorelinux.net%3A80%2F%0A&op=translate
the above link translates the web archive page

into this: https://web-archive-org.translate.goog/web/20200303035953/http://wiki.tinycorelinux.net:80/?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=en-US&_x_tr_pto=wapp

imho a slightly more recognisable page... which appears to take care of
"And then cut that part from the html files that belongs to web archive."

one potentially more tricky step of automating some kind of diff

Offline CentralWare

  • Administrator
  • Hero Member
  • *****
  • Posts: 1477
Re: WIKI RECOVERY
« Reply #2 on: October 06, 2022, 12:18:38 AM »
@mocore:
might https://translate.google.com  be of some use
If doing links manually, sure, Google filters and un-encodes URLs nicely. For automation, though, unless there's an API for the Google page in question (and you have an account for it), they go to extremes to obfuscate the source code of their websites. Run your translation page as you described, then View --> Source on that page; I'd imagine it would be difficult to automate anything with that mess :(

If it's just a matter of making the link, such as your example, be "easier on the eyes", by all means, translate away!!!  I love things that are easy on the eyes!

The goal for the link is merely a reference point; if someone wishes to point out differences between WIKI and WAYBACK but doesn't feel like doing the steps to create a replacement text page, I'm still happy for the link!  As long as I can click on it, it'll take me there! ;)

Quote
one potentially more tricky step of automating some kind of diff
PHP's urldecode() function should do the same principal job as Google's Translate page does, IF PHP's in the mix of automating/diff
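
If PHP isn't in the mix, the same decoding can be sketched with Python's standard library. This is only an illustration: the helper name embedded_url is made up here, and unquote_plus is used because, like PHP's urldecode(), it also turns "+" into a space.

```python
from urllib.parse import parse_qs, unquote_plus, urlsplit

# The Google Translate link posted earlier in this thread.
translate_link = (
    "https://translate.google.com/?sl=auto&tl=en"
    "&text=https%3A%2F%2Fweb.archive.org%2Fweb%2F20200303035953"
    "%2Fhttp%3A%2F%2Fwiki.tinycorelinux.net%3A80%2F%0A&op=translate"
)


def embedded_url(link: str) -> str:
    """Return the decoded 'text' parameter of a translate-style link.

    Hypothetical helper for illustration; not part of any existing tool.
    """
    query = parse_qs(urlsplit(link).query)  # parse_qs percent-decodes values
    return query["text"][0].strip()         # strip the trailing %0A newline


wayback_url = embedded_url(translate_link)
print(wayback_url)

# A bare percent-encoded string decodes the same way urldecode() would.
decoded = unquote_plus("http%3A%2F%2Fwiki.tinycorelinux.net%3A80%2F")
print(decoded)
```

Either way the result is just the plain Wayback link, which is all that's needed as a reference point.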

Thanks!
« Last Edit: October 06, 2022, 12:20:22 AM by CentralWare »

Offline rhermsen

  • Jr. Member
  • **
  • Posts: 86
Re: WIKI RECOVERY
« Reply #3 on: October 06, 2022, 02:20:35 PM »
Hi CentralWare,

I looked into whether I could create a list of links to the current Wiki and to the latest available Wayback page from before the crash.

I did that for the Overview page in an Excel sheet.
Finding the latest Wayback page did require some mouse-clicks, but from there it should be doable to script a compare. Did that for 124 pages.

The current Wiki db you restored must be from between 23-Apr-2019 and 29-Jun-2019.
The oldest Wayback page that I have seen so far with a 'Permission Denied' is from 24-Apr-2020.

Maybe it is possible to create a temp. Wiki page with e.g. a table that can be updated?
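
A scripted compare could look roughly like the sketch below, using Python's difflib. It assumes both versions have already been reduced to plain text; the function name compare_pages and the sample content are made up for illustration.

```python
import difflib


def compare_pages(wiki_text: str, wayback_text: str, name: str) -> list[str]:
    """Return unified-diff lines; an empty list means the two texts match."""
    return list(
        difflib.unified_diff(
            wiki_text.splitlines(),
            wayback_text.splitlines(),
            fromfile=f"wiki/{name}",
            tofile=f"wayback/{name}",
            lineterm="",
        )
    )


# Made-up sample content, only to show the shape of the output.
current = "====== Mirrors ======\nhttp://example.org/old\n"
archived = "====== Mirrors ======\nhttp://example.org/new\n"

diff_lines = compare_pages(current, archived, "mirrors.txt")
for line in diff_lines:
    print(line)
```

Pages whose diff comes back empty can be skipped, which should shrink the list that needs a manual look.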


Offline patrikg

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 615
Re: WIKI RECOVERY
« Reply #4 on: October 07, 2022, 04:42:55 AM »
Hi rhermsen.

If you download my file, you have all the web-archive files in there, from before the PHP server update. That update is what I think caused the wiki to go down at the front login page, because the connection commands between PHP and SQL are not the same in the new version.

I found a script online that downloads all the files from the web archive; you just set the start date and end date, and it downloads the original files (you don't need to remove the appended web-archive tags from the web pages).

But someone needs to make a script to convert the web HTML to DokuWiki format.
I don't know if you can import HTML into DokuWiki and have DokuWiki convert it itself.

But not all files in my wiki.7z file are correct, I think, because the wiki was hit by some XSS hacks.
I think someone with some programming knowledge could write a script to rename and delete files, and maybe run md5sum to check whether there are copies of files, and use the "file" utility to determine what each file is and what extension it should have.

Here's the thread with the link to the file:
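
The md5sum part of that idea might be sketched as below. This only covers finding identical copies; it does not attempt what the "file" utility does. The function names and the demo file names are hypothetical, and the demo runs on throwaway files rather than the real archive.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def md5_of(path: Path) -> str:
    """Hash a file in chunks so large files do not fill memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[str]]:
    """Group files under root by MD5; return only groups with 2+ members."""
    by_hash = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[md5_of(path)].append(path.relative_to(root).as_posix())
    return [names for names in by_hash.values() if len(names) > 1]


# Demo on throwaway files (hypothetical names), not the real wiki.7z content.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "start.html").write_text("same content")
    (root / "start(1).html").write_text("same content")
    (root / "mirrors.html").write_text("different content")
    demo_groups = find_duplicates(root)

print(demo_groups)
```

Each group lists files with identical content, so all but one per group are candidates for deletion.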
https://forum.tinycorelinux.net/index.php/topic,25947.msg166517.html#msg166517

Offline rhermsen

  • Jr. Member
  • **
  • Posts: 86
Re: WIKI RECOVERY
« Reply #5 on: October 07, 2022, 12:51:52 PM »
Hi Patrikg,

I'm less than an hour of mouse-clicks away from having a list of all the latest pages that are available in the web archive.
I used the 'sitemap' page, which has 318 links to pages.

With the list it should be possible to get the latest available page via 'wget $page -O $file' and a 'cat $file | grep modified' to get the latest modification date.
Every page that was modified e.g. after 2019-04 can then be checked (I got the impression that about 10 or so pages were changed after 2019-04).
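
The 'grep modified' step could be sketched in Python as follows. The footer format ("Last modified: YYYY/MM/DD HH:MM") is an assumption based on how the dates appear on the archived pages, and the cutoff of 1 May 2019 is just one reading of "after 2019-04"; both would need checking against a real download.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed footer shape, e.g. "... Last modified: 2020/02/12 19:43 by someone"
LAST_MODIFIED = re.compile(r"Last modified:\s*(\d{4}/\d{2}/\d{2} \d{2}:\d{2})")

# Assumed cutoff: anything modified on or after 1 May 2019 gets a closer look.
CUTOFF = datetime(2019, 5, 1)


def last_modified(page_text: str) -> Optional[datetime]:
    """Return the page's modification time, or None if no footer is found."""
    match = LAST_MODIFIED.search(page_text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y/%m/%d %H:%M")


def needs_check(page_text: str, cutoff: datetime = CUTOFF) -> bool:
    """True when the page was modified on or after the cutoff."""
    stamp = last_modified(page_text)
    return stamp is not None and stamp >= cutoff


# Made-up footer lines for illustration.
print(needs_check("wiki/mirrors.txt Last modified: 2020/02/12 19:43 by someone"))
print(needs_check("wiki/start.txt Last modified: 2017/09/25 10:00 by someone"))
```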

I found two pages in the web-archive that are not available in the current Wiki.

http://web.archive.org/web/20200209095435/http://wiki.tinycorelinux.net/picore:installation
http://web.archive.org/web/20200209191229/http://wiki.tinycorelinux.net/picore:pi_zero_w_wifi

Can you verify if you have these pages in your file?
These are also missing as link from the Overview https://wiki.tinycorelinux.net/doku.php?id=wiki:start
See: http://web.archive.org/web/20200209000311/http://wiki.tinycorelinux.net/wiki:start

Offline rhermsen

  • Jr. Member
  • **
  • Posts: 86
Re: WIKI RECOVERY
« Reply #6 on: October 07, 2022, 04:56:16 PM »
  • Found 9 pages that have some updates, and two pages that are missing. See the tables below.
  • Most changes are minor textual changes, except the setting_up_wifi.txt page, which had a bit of a makeover for the first part of the page.
  • (Ignore my earlier remark regarding opening and closing double quotes; that looks like a difference between the two sites.)


Update
Current Wiki | Wayback Machine
https://wiki.tinycorelinux.net/doku.php?id=dcore:locale_in_ub-dcore | http://web.archive.org/web/20200131055631/http://wiki.tinycorelinux.net/dcore:locale_in_ub-dcore
https://wiki.tinycorelinux.net/doku.php?id=dcore:server_applications | http://web.archive.org/web/20191229045821/http://wiki.tinycorelinux.net/dcore:server_applications
https://wiki.tinycorelinux.net/doku.php?id=dcore:welcome | http://web.archive.org/web/20200303053236/http://wiki.tinycorelinux.net/dcore:welcome
https://wiki.tinycorelinux.net/doku.php?id=wiki:creating_extensions | http://web.archive.org/web/20200208152109/http://wiki.tinycorelinux.net/wiki:creating_extensions
https://wiki.tinycorelinux.net/doku.php?id=wiki:mirrors | http://web.archive.org/web/20200212194300/http://wiki.tinycorelinux.net/wiki:mirrors
https://wiki.tinycorelinux.net/doku.php?id=wiki:mplayer-nodeps | http://web.archive.org/web/20200130110602/http://wiki.tinycorelinux.net/wiki:mplayer-nodeps
https://wiki.tinycorelinux.net/doku.php?id=wiki:remastering_with_ezremaster | http://web.archive.org/web/20200224015037/http://wiki.tinycorelinux.net/wiki:remastering_with_ezremaster
https://wiki.tinycorelinux.net/doku.php?id=wiki:setting_up_wifi | http://web.archive.org/web/20200128173346/http://wiki.tinycorelinux.net/wiki:setting_up_wifi
https://wiki.tinycorelinux.net/doku.php?id=wiki:start | http://web.archive.org/web/20200209000311/http://wiki.tinycorelinux.net/wiki:start

New Page
http://web.archive.org/web/20200209095435/http://wiki.tinycorelinux.net/picore:installation
http://web.archive.org/web/20200209191229/http://wiki.tinycorelinux.net/picore:pi_zero_w_wifi
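
Pairing the two kinds of link can be scripted, since the page id and the capture time are both visible in the URLs. The sketch below assumes the URL shapes used in the lists above (doku.php?id=... on the current wiki, a 14-digit timestamp in Wayback links); the function names are made up.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

# Wayback links look like http://web.archive.org/web/<14 digits>/<original URL>
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def wiki_page_id(url: str) -> str:
    """Page id from either a doku.php?id=ns:page or a /ns:page style URL."""
    parts = urlsplit(url)
    ids = parse_qs(parts.query).get("id")
    return ids[0] if ids else parts.path.lstrip("/")


def split_wayback(url: str) -> tuple[datetime, str]:
    """Return (capture time, page id) for a Wayback Machine link."""
    match = WAYBACK.match(url)
    if match is None:
        raise ValueError(f"not a Wayback link: {url}")
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured, wiki_page_id(match.group(2))


current = "https://wiki.tinycorelinux.net/doku.php?id=dcore:welcome"
archived = "http://web.archive.org/web/20200303053236/http://wiki.tinycorelinux.net/dcore:welcome"

captured, page_id = split_wayback(archived)
print(page_id, captured.isoformat(), page_id == wiki_page_id(current))
```

With the page id as the key, a list of current pages and a list of archive links can be matched up without any mouse-clicks.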


UPDATES COMPLETE!
« Last Edit: October 08, 2022, 09:28:17 PM by CentralWare »

Offline CentralWare

  • Administrator
  • Hero Member
  • *****
  • Posts: 1477
Re: WIKI RECOVERY
« Reply #7 on: October 08, 2022, 06:26:26 AM »
http://web.archive.org/web/20200209095435/http://wiki.tinycorelinux.net/picore:installation
http://web.archive.org/web/20200209191229/http://wiki.tinycorelinux.net/picore:pi_zero_w_wifi
These are verified missing from the old hosting image.
Quote
These are also missing as link from the Overview...
The start/welcome pages are assumed as well (any time there are missing pages, they will usually seed from start.)

However, the following link (dates) doesn't make sense compared to the crash date:
http://web.archive.org/web/20190922191859/http://wiki.tinycorelinux.net/?do=recent
...nor does the Site Index, as it says it was last updated in 2017??

Thank you both for your tireless efforts!
We still have to strip HTML tags, implement DokuWiki tags (etc etc etc) BUT if there are only 11 or so pages that need work out of 300+...  I could re-write 'em from scratch and still be happy about it!
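
For what it's worth, the strip-and-retag step for simple pages might be sketched like this. It handles only bold, italic, underline, headings and paragraphs, using the DokuWiki markers ** // __ and ====== ... ======; everything else (tables, links, lists) is left out, and the class and function names are made up.

```python
from html.parser import HTMLParser

# DokuWiki markers for a few inline tags.
INLINE = {"b": "**", "strong": "**", "i": "//", "em": "//", "u": "__"}
# DokuWiki headings: level 1 uses six '=' signs, level 5 uses two.
HEADINGS = {f"h{level}": "=" * (7 - level) for level in range(1, 6)}


class DokuWikiWriter(HTMLParser):
    """Collects text, swapping a handful of HTML tags for DokuWiki markup."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE:
            self.parts.append(INLINE[tag])
        elif tag in HEADINGS:
            self.parts.append(HEADINGS[tag] + " ")

    def handle_endtag(self, tag):
        if tag in INLINE:
            self.parts.append(INLINE[tag])
        elif tag in HEADINGS:
            self.parts.append(" " + HEADINGS[tag] + "\n")
        elif tag == "p":
            self.parts.append("\n\n")  # blank line between paragraphs

    def handle_data(self, data):
        self.parts.append(data)


def html_to_dokuwiki(html: str) -> str:
    writer = DokuWikiWriter()
    writer.feed(html)
    writer.close()
    return "".join(writer.parts).strip()


# Made-up sample markup, only to show the conversion.
sample = "<h1>Sample Page</h1><p>Some <b>bold</b> and <em>italic</em> text.</p>"
converted = html_to_dokuwiki(sample)
print(converted)
```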

« Last Edit: October 08, 2022, 06:32:15 AM by CentralWare »

Offline rhermsen

  • Jr. Member
  • **
  • Posts: 86
Re: WIKI RECOVERY
« Reply #8 on: October 08, 2022, 07:14:31 AM »
Regarding the link http://web.archive.org/web/20190922191859/http://wiki.tinycorelinux.net/?do=recent
The info does seem to match what I have obtained; specifically, I verified the last 4 from April 2019 with a text compare, and the current Wiki is the same.
So it looks like the 'framework' was last updated on 2017/09/25, and the server keeps the page current.

For 8 of the 9 pages I suspect it is even less work, just a handful of lines to change with little to no formatting.

There are tools that can obtain text from web pages, like 'lynx -dump -nolist', 'elinks [-dump]' and 'w3m -dump website.html' [1], but these are not yet available in TCL. But then again, I did that by opening the page, copying the text, and pasting it into NP++ to get the text without formatting. That looks to me like the easiest way for the two missing pages.


[1] https://unix.stackexchange.com/questions/42636/how-to-get-text-of-a-page-using-wget-without-html
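
Where none of those tools is at hand, a rough stand-in can be sketched with Python's standard html.parser. It is not equivalent to 'lynx -dump': it just drops tags, skips script/style content and breaks lines at a few block tags (the tag list here is my own pick).

```python
from html.parser import HTMLParser


class TextDump(HTMLParser):
    """Keeps visible text only; script and style content is dropped."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.chunks.append("\n")  # start block-level content on a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def dump_text(html: str) -> str:
    parser = TextDump()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)


# Made-up sample page for illustration.
page = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Title</h1><p>Hello <b>world</b></p><script>var x=1;</script>"
    "</body></html>"
)
text = dump_text(page)
print(text)
```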

Offline patrikg

  • Wiki Author
  • Hero Member
  • *****
  • Posts: 615
Re: WIKI RECOVERY
« Reply #9 on: October 08, 2022, 07:47:36 AM »
If there are some missing picture files, I think you can use my file to add them to DokuWiki.


Offline CentralWare

  • Administrator
  • Hero Member
  • *****
  • Posts: 1477
Re: WIKI RECOVERY
« Reply #10 on: October 08, 2022, 08:06:33 AM »
...but these are not yet available in TCL.
".top" is the sandbox for any changes being made before they're taken live
Click and scroll to the bottom
"Revision Needed" I've added as the content is...  well, a little dated in some areas.  (it was written when it was EASY to get your hands on a RasPi!! :) )

Offline CNK

  • Full Member
  • ***
  • Posts: 202
Re: WIKI RECOVERY
« Reply #11 on: October 08, 2022, 09:11:53 PM »
But someone needs to make a script to convert the web HTML to DokuWiki format.
I don't know if you can import HTML into DokuWiki and have DokuWiki convert it itself.

There's Pandoc, which claims to be able to convert DokuWiki format to/from HTML.

This Perl script converts HTML to DokuWiki format.

There's also a similar Python script, which sounds like it's a little unfinished.

More here, including online converter webpages.

It looks like this advice is probably too late, but free advice often is. For next time, maybe?

Offline CentralWare

  • Administrator
  • Hero Member
  • *****
  • Posts: 1477
Re: WIKI RECOVERY
« Reply #12 on: October 08, 2022, 09:38:22 PM »
Good evening, and many thanks to everyone who has been involved!

After a great deal of typing, formatting, etc., the "known" pages from above have been updated (typos I found were corrected, with a little punctuation here and there and a bit of re-formatting to make things easier on the eyes where necessary), and the content has been sent over to .net for your approval!  All that should be left that's wiki-related is administration, once the Forum is beat up on a little more.

In my opinion (and this is how we normally set our clients up), the WIKI is a read-only board of knowledge for all to see, where the administrators tend to its upkeep and to adding content.  Users of the wiki generally head over to the forum (dedicated links within the wiki, usually) if they have comments, suggestions, edits, etc. to offer; thus the wiki always stays clean (and not prone to bots!)
Please offer your thoughts and opinions on this!

Thanks!

T.J.

Offline CNK

  • Full Member
  • ***
  • Posts: 202
Re: WIKI RECOVERY
« Reply #13 on: October 08, 2022, 10:57:26 PM »
In my opinion (and this is how we normally set our clients up), the WIKI is a read-only board of knowledge for all to see, where the administrators tend to its upkeep and to adding content.  Users of the wiki generally head over to the forum (dedicated links within the wiki, usually) if they have comments, suggestions, edits, etc. to offer; thus the wiki always stays clean (and not prone to bots!)
Please offer your thoughts and opinions on this!

I'd like to see the wiki used instead of the forums for posting most how-to type posts because I find those very difficult to discover via the forum - they're hard to find amongst all the usual "help me" type threads. The "TCE Tips & Tricks" sub-forum does help though.

To that end I'd prefer if it were openly editable by all users. But I'm not volunteering to continually check edits for spam myself, so I may be asking too much to assume that others will keep that in check.

Offline gadget42

  • Hero Member
  • *****
  • Posts: 558
Re: WIKI RECOVERY
« Reply #14 on: October 09, 2022, 12:07:41 AM »
re: wiki editing

using the member list as a reference point, it appears that the break-point between "Full Member" and "Sr. Member" is 250 posts.

as of 20221009 there are approximately 75 members who have above 250 posts(although many are-not/have-not-been active recently).

is there a way to give wiki editing permission to a subset of forum members("Sr. Member" and above)?

IMHO, "everyone" should NOT be able to edit the wiki.

sharing is caring
The fluctuation theorem has long been known for a sudden switch of the Hamiltonian of a classical system Z54 . For a quantum system with a Hamiltonian changing from... https://forum.tinycorelinux.net/index.php/topic,25972.msg166580.html#msg166580