Tiny Core Linux
Tiny Core Base => Raspberry Pi => Topic started by: onelife on January 06, 2014, 01:41:13 AM
-
Hi forum,
I've looked at the filetool.sh help etc and the boot options on TC and I am wondering how one actually takes advantage of the "mydatabk.tgz" without having to manually move / rename anything?
IOW, how do I modify the boot process to actually "fall back" to using the "mydatabk.tgz" file if the "mydata.tgz" file extraction fails at any point?
I'm maybe missing something, but surely the extraction process has an exit code and / or a known status on boot, and if that fails, the boot process could then cycle and try to boot using the mydatabk.tgz?
I'm just trying to build an as solid as possible system :)
Any ideas welcome - Thank You!
-
Anybody have any ideas? Seems weird that a "backup" should exist but there's no automatic boot process to "recover" to this backup if boot fails.
Hope someone has some suggestions :)
Thanks!
-
Hmm, it might make sense to automatically fall back to mydatabk.tgz for restoring only.
You can specify which backup file to use via bootcode:
mydata=mydatabk
But this would also overwrite the current mydatabk.tgz file if doing another backup.
You might take a look at /etc/init.d/tc-config and /usr/bin/filetool.sh , maybe you can tweak it to your needs.
-
Boot code "norestore" ??
-
Nah, it's not about getting the system to boot without restoring the backup, but to actually make use of the backup's backup (mydatabk.tgz) which can be created but seems to be unused.
I agree with the OP that, if restoring from the backup file fails, a restore from ...bk.tgz could be attempted instead (without specifying that file as the new backup file, though). Maybe after user confirmation.
-
You can always boot with 'norestore', copy the backup to current and reboot.
-
Hi all,
Thanks for these ideas. The objective on my side is that the "restore" needs to be 100% automated without any intervention.
So I need some way to say that if loading mydata fails, it should then reboot and try to use mydatabk.
However that happens, be it through file renaming etc., I don't mind .. but I need to try to ensure it's 100% automated.
Does anyone know if there are exit codes for the filetool.sh process?
Big thanks!
-
I'm curious, how would tc know it failed to boot? And for what reason?
-
I think this is exactly what I'm trying to get working. Essentially I would think within core itself, it should have a "check" that says if it fails to load the mydata.tgz file and settings, the main kernel should know to try reboot with the mydatabk.tgz.
At the moment, as you say, there's no way I can set a "flag" or similar to tell it to reboot with the mydatabk file. IOW, if I put a "check" into my own script etc, that too is part of the mydata and if not loaded, the check wouldn't work.
Perhaps a boot code could be introduced that would set this type of "flag" to auto reboot with mydatabk if mydata "fails" to extract.
So in my way of thinking, the main OS itself would know that if mydata "fails" it should try mydatabk.
Your thoughts on this?
BIG Thanks!
-
First I would evaluate when and why a mydata.tgz restore would fail. Is it really a case we have to be prepared for? I have never seen a single case where it happened. What can go wrong?
- The file itself can get corrupted. Very low probability on a journaling file system. If it happens, the system is probably corrupted, not just the backup file. Anyhow, you can create an md5 file and check whether the backup is valid before extracting.
What can also happen is that there is no free RAM to extract into, but it would be the same with mydatabk.tgz too. It must be guaranteed at creation time that the backup file is valid.
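The md5 check suggested above could look roughly like the following. This is only a sketch: write_md5 and check_md5 are made-up helper names, and storing the checksum in a file next to the backup (for example mydata.tgz.md5) is an assumption, not something filetool.sh does by itself.

```shell
#!/bin/sh
# Sketch only: write_md5 and check_md5 are hypothetical helper names.

# Record the checksum of a finished backup next to it (e.g. mydata.tgz.md5).
write_md5() {
    md5sum "$1" | cut -d' ' -f1 > "$1.md5"
}

# Succeed only if the backup still matches the recorded checksum.
check_md5() {
    [ -f "$1" ] && [ -f "$1.md5" ] || return 1
    [ "$(md5sum "$1" | cut -d' ' -f1)" = "$(cat "$1.md5")" ]
}
```

write_md5 would be called once a backup has completed, and check_md5 before the backup is trusted again.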
-
Hi Bela,
Thanks for this - The case is when the power is removed during the "filetool.sh" process.
I know this is very rare and unlikely but if the filetool.sh process is "stopped" mid way, the created mydata.tgz file is then "corrupt"
So, yip, I'm keen on any way I can make sure if the "filetool.sh" process doesn't complete 100%, it would perhaps even try again? A thought?
You can "reproduce" this situation by running filetool.sh -vb and then "ctrl c" to stop it mid way. Then try reboot and that's the type of "corrupt" mydata.tgz I'm trying to avoid.
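A half-written archive like this can usually be spotted without extracting it, by listing it with tar and looking at the exit status. A minimal sketch, assuming the tar on the system supports -z; archive_ok is a made-up helper name:

```shell
#!/bin/sh
# Sketch only: archive_ok is a hypothetical helper name.

# Succeed if the whole .tgz can be read and listed; a truncated file
# normally makes tar (or gzip underneath it) exit with an error.
archive_ok() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}
```

For example, `archive_ok mydata.tgz || echo "mydata.tgz looks damaged"`. Note this only tells you the archive is readable, not that it contains everything from the backup list.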
Big thanks again!
-
So, yip, I'm keen on any way I can make sure if the "filetool.sh" process doesn't complete 100%, it would perhaps even try again? A thought?
How to try it again if device is unpowered?
What about using 'Safe backup' mode?
-
If you follow the steps :
1) power up TC
2) run filetool.sh -vb and interrupt the process with ctrl c half way through.
3) reboot the system
4) TC will not load mydata.tgz and thus fail to load any settings that would enable me to remotely connect to the system.
This will produce the issue that basically shows how you can't restore a "corrupt" / incomplete mydata.tgz and it's then "stuck".
I'm already using "safe backup" as that's the whole idea - I'm trying now to automatically benefit from previously created mydatabk.tgz.
IOW, if the mydata.tgz was not created 100% and thus can't actually be extracted, the system should "fall back" to using the mydatabk.tgz.
Does this help explain :) ?
-
Hi onelife
After running a backup, generate an MD5 file of it. If the backup doesn't complete, the MD5 file won't match on the next reboot. Modify the bootsync.sh file to check if the MD5 file matches.
If it doesn't, copy the backup file and MD5 over the originals and reboot (use cp and not mv).
If it does, copy the current backup and MD5 over the previous version.
You will need to remaster to modify the bootsync.sh file since these changes can't be part of the backup for obvious reasons.
You should also add a test to see if both backup files exist. If they don't, allow the boot to complete so they can be created.
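The steps above can be sketched as a function that bootsync.sh could call. Everything named here is an assumption for illustration: check_backups is a made-up name, the checksum is assumed to live in mydata.tgz.md5 next to the backup, and the function only prints what it decided so that the caller chooses whether to reboot.

```shell
#!/bin/sh
# Sketch only: check_backups and the *.md5 file names are hypothetical.
# Prints "continue", "ok" or "restored"; the caller decides about rebooting.

check_backups() {
    dir="$1"
    cur="$dir/mydata.tgz"
    prev="$dir/mydatabk.tgz"

    # Not everything exists yet: let the boot finish so the files get created.
    for f in "$cur" "$cur.md5" "$prev"; do
        [ -f "$f" ] || { echo continue; return 0; }
    done

    if [ "$(md5sum "$cur" | cut -d' ' -f1)" = "$(cat "$cur.md5")" ]; then
        # Current backup is intact: it becomes the new known-good copy.
        cp -f "$cur" "$prev" && cp -f "$cur.md5" "$prev.md5"
        echo ok
    else
        # Current backup is damaged: roll back using cp, not mv.
        cp -f "$prev" "$cur"
        if [ -f "$prev.md5" ]; then
            cp -f "$prev.md5" "$cur.md5"
        else
            # Assumption: no saved MD5 for the old copy, so recompute it.
            md5sum "$cur" | cut -d' ' -f1 > "$cur.md5"
        fi
        echo restored
    fi
}
```

In bootsync.sh this might be used as `[ "$(check_backups /mnt/mmcblk0p2/tce)" = restored ] && sudo reboot`, where the tce path is only a guess at a typical piCore layout. A real script would also want a guard against rebooting in a loop if the previous backup turns out to be bad as well.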
-
Hi Rich,
Thanks for that - Sounds fab! but please excuse my possibly silly question, but this would mean I need to add the md5 check element to the filetool.sh script right? I think this must be the case as otherwise, there's still no way of knowing if the filetool.sh process was completed 100%? Or am I missing something. IOW, you can't md5 checksum a backup that you don't know for 100% sure completed successfully.
*** updated thought ***
After looking at the filetool.sh script, another maybe silly thought: surely the backup process could first create, for example, a "mydata_new.tgz" file, and only AFTER the tar has completed do the "mv -f" to move the current mydata.tgz > mydatabk.tgz and THEN mydata_new.tgz > mydata.tgz. Just a thought, as at the moment I see the "mv -f" happens before the new backup has actually been taken, and I guess that's where the "danger window" occurs.
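To make the idea concrete, here is a rough sketch of that ordering. It is not how filetool.sh currently works; rotate_backup is a made-up name, and the tar arguments are simply passed through rather than taken from the real backup list.

```shell
#!/bin/sh
# Sketch only: rotate_backup is a hypothetical name, not part of filetool.sh.
# Usage: rotate_backup TCE_DIR <tar arguments naming what to back up>

rotate_backup() {
    dir="$1"; shift
    new="$dir/mydata_new.tgz"

    # Build the new archive under a temporary name; existing files stay untouched.
    tar -czf "$new" "$@" || { rm -f "$new"; return 1; }

    # Check that it can be read back before it replaces anything.
    tar -tzf "$new" >/dev/null 2>&1 || { rm -f "$new"; return 1; }
    sync

    # Only now rotate: current -> previous, new -> current.
    [ -f "$dir/mydata.tgz" ] && mv -f "$dir/mydata.tgz" "$dir/mydatabk.tgz"
    mv -f "$new" "$dir/mydata.tgz"
    sync
}
```

With this ordering a power cut during the tar step leaves mydata.tgz and mydatabk.tgz as they were. There is still a short moment between the two mv calls where only mydatabk.tgz and mydata_new.tgz exist.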
Thanks again for your help - Much appreciated!
-
Hi onelife
First thing to remember, there is no such thing as failsafe. All you can do is try to make a system as robust as possible and minimize the windows of opportunity for a failure.
... but this would mean I need to add the md5 check element to the filetool.sh script right?
Not necessarily. You could call the filetool.sh script from the shutdown script and then generate the MD5 from there.
... there's still no way of knowing if the filetool.sh process was completed 100%?
The script outputs the word failed if it fails.
-
hi onelife,
filetool.sh writes 2 files to /tmp, backup_done and backup_status. You need to monitor both of these.
I have found a couple of things that may be of interest.
1. It is possible to have a backup error in backup_status but have an earlier backup_done still present. If I am using a script I delete both files before a backup.
2. The only backup errors I have ever had are a file in filetool.lst that no longer exists or a typo in filetool.lst.
3. If you are using the command line, alternating between using "sudo filetool.sh -b" and "backup" commands may result in a permission error on the backup_done and backup_status files.
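A small wrapper along the lines of point 1 might look like this. It is only a sketch: run_backup is a made-up name, and treating "backup_done exists and backup_status is absent or empty" as success is an assumption about what filetool.sh writes, so check it against your version of the script.

```shell
#!/bin/sh
# Sketch only: run_backup is a hypothetical wrapper.
# Usage: run_backup STATUS_DIR COMMAND [ARGS...]

run_backup() {
    tmp="$1"; shift

    # Remove stale results so an old backup_done cannot mask a new failure.
    rm -f "$tmp/backup_done" "$tmp/backup_status"

    # Run the backup command; its own exit status is deliberately ignored here.
    "$@"

    # Assumption: success means backup_done exists and backup_status is
    # either missing or empty.
    [ -f "$tmp/backup_done" ] && [ ! -s "$tmp/backup_status" ]
}
```

For example, `run_backup /tmp filetool.sh -b || echo "backup failed"`. Running it as the same user every time should also avoid the permission problem from point 3.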
regards
-
Hi all,
Not sure if this video will help explain the "issue" in any way.
www.shopbeat.co.za/picore_failed_mydata.swf
Essentially my steps were :
1) run filetool.sh and mid way pull the power OR do ctrl c.
2) Repower or reboot the system
You will see the mydata.tgz that was not completed will extract what it has, but essentially it won't load all settings as any files missed from beyond the power outage / ctrl c point will not be there.
So as I say, I understand that it can't be 100% failsafe, but I do think there's a bigger window of error because the filetool.sh process moves mydata.tgz > mydatabk.tgz before a valid new backup exists, rather than only AFTER creating one. Perhaps the move from mydata.tgz > mydatabk.tgz should in fact only happen after the new backup has been "sudo" tested?
Again, please don't think I'm being difficult, I'm just keen to try make our system as solid as possible as here in Africa power outages are a very real thing.
FYI, I have about 150 TC systems running and so far this has only happened once but I'm sure eventually it will happen again ;)
Chat soon - Many Thanks!
-
Your suggestion will require more free space in the tce directory because now you need room for 3 copies of the backup.
This introduces another possible cause of failure.
-
Hi onelife,
You are implying that the data on the SD card is very important. I would not trust a "single" media solution for any critical data, especially an SD card. I would set up one Raspberry Pi as a server (with a USB hard drive and a backup procedure) and make a second backup there.
regards