Hello there,
First of all, I am new to FreeBSD; so far I only have some rather limited experience with Linux (mostly Debian).
I have a FreeBSD (version 8.1-RELEASE-p11) server running as a Samba and LDAP server.
Just for your information, I have been responsible for managing it (among several other FreeBSD servers) since I replaced the sysadmin who just left the company. He didn't really have the time to explain to me how the servers were configured.
This morning, a user complained that he couldn't save a file on the Samba share because he got an error message saying it was full. So I connected to the server over SSH and quickly figured out that the ZFS quota had been reached for this specific share. I increased the quota, which solved that issue.
I left my terminal open in the background because I was using the find command to search for some files. When I came back to the terminal, I noticed that the session had timed out. I tried to reconnect, but to no avail. Pings to the server's IP address also failed. I then headed to the server room, connected a monitor to the server (physical, not a VM or jail) and saw an error mentioning a kernel panic (or something along those lines), saying that the system was supposed to reboot automatically in 10 or 15 seconds. After a few minutes nothing had happened, so I decided to hard-reset the server.
It then rebooted and, from what I saw, there was nothing wrong or suspicious during the boot process. I logged in locally as root just to make sure nothing obvious was broken, and then exited the root session.
After some time, the same user told me that he wasn't able to open an .ods (Open Office Calc) file from the Samba share; instead he got a dialog asking him to choose various settings in order to import the file. It quickly became clear that the file was corrupted: opening it in a text editor just displayed loads of repeated "NULL" strings, and that was it. Files on the share were not backed up, otherwise I wouldn't be here.
Soon after, the same person told me that there were other files like this (maybe 10 or 12, I don't know exactly) that were basically left unreadable. Their sizes were not 0 bytes, and I think they were actually what they should have been before the files became corrupted. Apparently, those were only, and all, the files that had been edited this morning, before the quota was used up and before I had to hard-reset the server.
Anyway, after trying different things without success (unzipping or extracting the .ods files, looking for files starting with the " ~ " character in the local shared directory, and so on), I was quite desperate.
Then I thought of the following:
- ZFS is a robust file system, with journalling capabilities and some form of redundancy (that is basically all I know about ZFS), so if there is a sudden power loss before pending modifications can be committed to the HDDs, the differences between the data in RAM and the data written to the zpool must be logged somewhere. I assumed (maybe wrongly) that ZFS would finish the pending writes right after the system rebooted, but apparently that wasn't the case.
Is there a command to check data integrity (compare the journal with the writes actually performed so far), to force pending writes to make it to the disks, or perhaps to rebuild the zpool?
- Samba should use a cache to manage the sharing of and access to files, right? But I don't know where it would be located, since I didn't notice anything about it in the smb.conf file. So maybe it uses the system's cache folder, like /tmp? I browsed it and I don't think I saw anything related to the corrupted files. Do you know where the cache would be by default?
Is there any way to (hopefully) recover those important files?
Please don't tell me files should be backed up regularly. I know, and trust me, if it depended only on me, they would be. The thing is, the business has a tight budget, and IT is (was) not a priority. Hopefully that will change soon, whether or not those files are eventually recovered.
Thank you for reading and I really hope you will be able to help me.