github.com/bareos/bareos.git
author Bruno Friedmann <bruno.friedmann@bareos.com> 2022-08-17 15:38:33 +0300
committer Philipp Storz <philipp.storz@bareos.com> 2022-08-30 19:54:35 +0300
commit be837ed02778eefe2d8ce88474319e98006f451b
tree 0a7eacfbfc08bcb38d9c8f350063cdc4dbabec37 /docs
parent 25ba49599c7f83fc7e626a45d7f869ec5b29ecda
docs: Troubleshooting improvements

- convert indexes and codeblock to Sphinx rest syntax.
- remove obsolete content (tcpwrapper,media volwrite integer).
- refresh examples.

Signed-off-by: Bruno Friedmann <bruno.friedmann@bareos.com>

Diffstat (limited to 'docs')
-rw-r--r-- docs/manuals/source/Appendix/Troubleshooting.rst 204
1 files changed, 98 insertions, 106 deletions
diff --git a/docs/manuals/source/Appendix/Troubleshooting.rst b/docs/manuals/source/Appendix/Troubleshooting.rst
index ea923c7b7..4bc99e974 100644
--- a/docs/manuals/source/Appendix/Troubleshooting.rst
+++ b/docs/manuals/source/Appendix/Troubleshooting.rst
@@ -13,7 +13,10 @@ The Bareos programs contain a lot of debug messages. Normally, these are not pri
Client Access Problems
----------------------
-:index:`\ <single: Problem; Cannot Access a Client>`\ There are several reasons why a |dir| could not contact a client on a different machine. They are:
+.. index::
+ single: Problem; Cannot Access a Client
+
+There are several reasons why a |dir| could not contact a client on a different machine. They are:
- Check if the client file daemon is really running.
@@ -21,8 +24,6 @@ Client Access Problems
- You have a firewall, and it is blocking traffic on port 9102 between the Director’s machine and the Client’s machine (or on port 9103 between the Client and the Storage daemon machines).
-- If your system is using Tcpwrapper (:file:`hosts.allow` or :file:`hosts.deny` file), verify that is permitting access.
-
- Your password or names are not correct in both the Director and the Client machine. Try configuring everything identical to how you run the client on the same machine as the Director, but just change the address. If that works, make the other changes one step at a time until it works.
Some of the DNS and Firewall problems can be circumvented by configuring clients using :ref:`section-ClientInitiatedConnection` or as :ref:`PassiveClient`.
@@ -30,38 +31,40 @@ Some of the DNS and Firewall problems can be circumvented by configuring clients
Difficulties Connecting from the FD to the SD
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-:index:`\ <single: Problem; Connecting from the FD to the SD>`\
+.. index::
+ single: Problem; Connecting from the FD to the SD
If you are having difficulties getting one or more of your File daemons to connect to the Storage daemon, it is most likely because you have not used a fully qualified domain name on the :config:option:`dir/storage/Address`\ directive. That is the resolver on the File daemon’s machine (not on the Director’s) must be able to resolve the name you supply into an IP address. An example of an address that is guaranteed not to work: :strong:`localhost`. An example that
may work: :strong:`bareos-sd1`. An example that is more likely to work: :strong:`bareos-sd1.example.com`.
You can verify how a |fd| resolves a DNS name by the following command:
-::
+.. code-block:: bconsole
+ :caption: Test DNS resolution on the |fd| \name{bareos-fd}
- \begin{bconsole}{Test DNS resolution of the \bareosFd \name{bareos-fd}}
*<input>resolve client=bareos-fd NONEXISTINGHOSTNAME</input>
Connecting to Client bareos-fd at bareos:9102
bareos-fd: Failed to resolve NONEXISTINGHOSTNAME
*<input>resolve client=bareos-fd bareos-sd1.example.com</input>
Connecting to Client bareos-fd at bareos:9102
bareos-fd resolves bareos-sd1.example.com to host[ipv4;192.168.0.1]
- \end{bconsole}
+
If your address is correct, then make sure that no other program is using the port 9103 on the Storage daemon’s machine. The Bacula project has reserved these port numbers by IANA, therefore they should only be used by Bacula and its replacements like Bareos. However, apparently some HP printers do use these port numbers. A :command:`netstat -lntp` on the |sd|’s machine can determine who is listening on the 9103 port (used for FD to SD communications in Bareos).
Authorization Errors
~~~~~~~~~~~~~~~~~~~~
-:index:`\ <single: Problem; Authorization Errors>`\ :index:`\ <single: Concurrent Jobs>`\
+.. index::
+ single: Problem; Authorization Errors
+ single: Concurrent Jobs
.. _AuthorizationErrors:
+For security reasons, Bareos requires that both the |fd| and the |sd| know the name of the |dir| as well as its password. As a consequence, if you change the |dir|’s name or password, you must make the corresponding change in the |fd|’s and in the |sd|’s configuration files.
-For security reasons, Bareos requires that both the File daemon and the Storage daemon know the name of the Director as well as its password. As a consequence, if you change the Director’s name or password, you must make the corresponding change in the Storage daemon’s and in the File daemon’s configuration files.
-
-During the authorization process, the Storage daemon and File daemon also require that the Director authenticates itself, so both ends require the other to have the correct name and password.
+During the authorization process, the |fd| and |sd| also require that the |dir| authenticates itself, so both ends require the other to have the correct name and password.
If you have edited the configuration files and modified any name or any password, and you are getting authentication errors, then your best bet is to go back to the original configuration files generated by the Bareos installation process. Make only the absolutely necessary modifications to these files – e.g. add the correct email address. Then follow the instructions in the :ref:`Running Bareos <TutorialChapter>` chapter of this manual. You will run a backup to disk and a restore.
Only when that works, should you begin customization of the configuration files.
@@ -74,12 +77,11 @@ Here is a picture that indicates what names/passwords in which files/Resources m
:width: 80.0%
-
-
-In the left column, you will find the Director, Storage, and Client resources, with their names and passwords – these are all in the |dir| configuration. The right column is where the corresponding values should be found in the Console, Storage daemon (SD), and File daemon (FD) configuration files.
+In the left column, you will find the |dir|, |sd|, and |fd| resources, with their names and passwords – these are all in the |dir| configuration. The right column is where the corresponding values should be found in the Console, |sd| (SD), and |fd| (FD) configuration files.
Another thing to check is to ensure that the Bareos component you are trying to access has :strong:`Maximum Concurrent Jobs`\ set large enough to handle each of the Jobs and the Console that want to connect simultaneously. Once the maximum connections has been reached, each Bareos component will reject all new connections.
+
.. _ConcurrentJobs:
.. _section-Interleaving:
@@ -87,7 +89,10 @@ Another thing to check is to ensure that the Bareos component you are trying to
Concurrent Jobs
---------------
-:index:`\ <single: Job; Concurrent Jobs>`\ :index:`\ <single: Running Concurrent Jobs>`\ :index:`\ <single: Concurrent Jobs>`\
+.. index::
+ single: Job; Concurrent Jobs
+ single: Running Concurrent Jobs
+ single: Concurrent Jobs
Bareos can run multiple concurrent jobs. Using the :strong:`Maximum Concurrent Jobs`\ directives, you can configure how many and which jobs can be run simultaneously:
@@ -127,19 +132,19 @@ Below is a super stripped down :file:`bareos-dir.conf` file showing you the four
# Bareos Director Configuration file -- bareos-dir.conf
#
Director {
- Name = rufus-dir
+ Name = bareos-dir
Maximum Concurrent Jobs = 4
...
}
Job {
Name = "NightlySave"
Maximum Concurrent Jobs = 4
- Client = rufus-fd
+ Client = bareos-fd
Storage = File
...
}
Client {
- Name = rufus-fd
+ Name = bareos-fd
Maximum Concurrent Jobs = 4
...
}
@@ -149,58 +154,18 @@ Below is a super stripped down :file:`bareos-dir.conf` file showing you the four
...
}
-Media VolWrites: integer out of range
--------------------------------------
-
-:index:`\ <single: Errors; integer out of range>`\ :index:`\ <single: Catalog; Media; VolWrites>`\
-
-In some situation, you receive an error message similar to this:
-
-.. code-block:: bconsole
-
- 12-Apr 15:10 bareos-dir JobId 15860: Fatal error: Catalog error updating Media record. sql_update.c:385 update UPDATE Media SET VolJobs=12,VolFiles=10,VolBlocks=155013,VolBytes=10000263168,VolMounts=233,VolErrors=0,VolWrites=2147626019,MaxVolBytes=0,VolStatus='Append',Slot=1,InChanger=1,VolReadTime=0,VolWriteTime=842658562655,LabelType=0,StorageId=3,PoolId=2,VolRetention=144000,VolUseDuration=82800,MaxVolJobs=0,MaxVolFiles=0,Enabled=1,LocationId=0,ScratchPoolId=0,RecyclePoolId=0,RecycleCount=201,Recycle=1,ActionOnPurge=0,MinBlocksize=0,MaxBlocksize=0 WHERE VolumeName='000194L5' failed:
- ERROR: integer out of range
-
-The database column **VolWrites** in the **Media** table stores the number of write accesses to a volume. It is only used for statistics.
-
-However, it has happened that the number of write accesses exceeds the maximum value supported by the database column (on |postgresql| it is currently 2147483647, 32 bit, signed integer). The result is a database error, similar to the one mentioned above.
-
-As a temporary fix, just reset this counter:
-
-.. code-block:: bconsole
- :caption: Reset the VolWrites counter
-
- 1000 OK: bareos-dir Version: 17.2.5 (14 Feb 2018)
- Enter a period to cancel a command.
- *<input>sqlquery</input>
- Automatically selected Catalog: MyCatalog
- Using Catalog "MyCatalog"
- Entering SQL query mode.
- Terminate each query with a semicolon.
- Terminate query mode with a blank line.
- Enter SQL query: <input>UPDATE Media SET VolWrites = 0 WHERE VolWrites > '2000000000';</input>
- No results to list.
- SELECT volwrites FROM media; volwrites > '0';
- +-----------+
- | volwrites |
- +-----------+
- | 0 |
- | 0 |
- | 0 |
- | 0 |
- +-----------+
- Enter SQL query:
-
-In the long run, it is planed to modify the database schema to enable storing much larger numbers.
.. _AnsiLabelsChapter:
Tape Labels: ANSI or IBM
------------------------
-:index:`\ <single: Label; Tape Labels>`\ :index:`\ <single: Tape; Label; ANSI>`\ :index:`\ <single: Tape; Label; IBM>`\
+.. index::
+ single: Label; Tape Labels
+ single: Tape; Label; ANSI
+ single: Tape; Label; IBM
-By default, Bareos uses its own tape label (see :ref:`backward-compatibility-tape-format` and :config:option:`dir/pool/LabelType`\ ). However, Bareos also supports reading and write ANSI and IBM tape labels.
+By default, Bareos uses its own tape label (see :ref:`backward-compatibility-tape-format` and :config:option:`dir/pool/LabelType`\ ). However, Bareos also supports reading and writing ANSI and IBM tape labels.
Reading
~~~~~~~
@@ -223,7 +188,8 @@ If you have labeled your volumes outside of Bareos, then the ANSI/IBM label will
Tape Drive
----------
-:index:`\ <single: Problem; Tape>`\
+.. index::
+ single: Problem; Tape
This chapter is concerned with testing and configuring your tape drive to make sure that it will work properly with Bareos using the btape program.
@@ -252,7 +218,7 @@ Do not proceed to the next item until you have succeeded with the previous one.
#. Make sure you have a valid and correct Device resource corresponding to your drive. For Linux users, generally, the default one works. For FreeBSD users, there are two possible Device configurations (see below). For other drives and/or OSes, you will need to first ensure that your system tape modes are properly setup (see below), then possibly modify you Device resource depending on the output from the btape program (next item). When doing this, you should consult the
:ref:`Storage Daemon Configuration <StoredConfChapter>` of this manual.
-#. If you are using a Fibre Channel to connect your tape drive to Bareos, please be sure to disable any caching in the NSR (network storage router, which is a Fibre Channel to SCSI converter).
+#. If you are using a Fiber Channel to connect your tape drive to Bareos, please be sure to disable any caching in the NSR (network storage router, which is a Fiber Channel to SCSI converter).
#. Run the btape test command:
@@ -313,7 +279,12 @@ Testing Autochanger and Adapting mtx-changer script
.. _section-MtxChangerManualUsage:
- :index:`\ <single: Autochanger; Testing>`\ :index:`\ <single: Autochanger; mtx-changer>`\ :index:`\ <single: Command; mtx-changer>`\ :index:`\ <single: Problem; Autochanger>`\ :index:`\ <single: Problem; mtx-changer>`\
+.. index::
+ single: Autochanger; Testing
+ single: Autochanger; mtx-changer
+ single: Command; mtx-changer
+ single: Problem; Autochanger
+ single: Problem; mtx-changer
In case, Bareos does not work well with the Autochanger, it is preferable to "hand-test" that the changer works. To do so, we suggest you do the following commands:
@@ -321,7 +292,8 @@ Make sure Bareos is not running.
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 list 0 /dev/nst0 0`
-:index:`\ <single: mtx-changer list>`\
+.. index::
+ single: mtx-changer list
This command should print:
@@ -340,7 +312,8 @@ or one number per line for each slot that is occupied in your changer, and the n
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 listall 0 /dev/nst0 0`
-:index:`\ <single: mtx-changer listall>`\
+.. index::
+ single: mtx-changer listall
This command should print:
@@ -367,37 +340,43 @@ This command should print:
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 transfer 1 2`
-:index:`\ <single: mtx-changer listall>`\
+.. index::
+ single: mtx-changer transfer
This command should transfer a volume from source (1) to destination (2)
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 slots`
-:index:`\ <single: mtx-changer slots>`\
+.. index::
+ single: mtx-changer slots
This command should return the number of slots in your autochanger.
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 unload 1 /dev/nst0 0`
-:index:`\ <single: mtx-changer unload>`\
+.. index::
+ single: mtx-changer unload
If a tape is loaded from slot 1, this should cause it to be unloaded.
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 load 3 /dev/nst0 0`
-:index:`\ <single: mtx-changer load>`\
+.. index::
+ single: mtx-changer load
Assuming you have a tape in slot 3, it will be loaded into drive (0).
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 loaded 0 /dev/nst0 0`
-:index:`\ <single: mtx-changer loaded>`\
+.. index::
+ single: mtx-changer loaded
It should print "3" Note, we have used an "illegal" slot number 0. In this case, it is simply ignored because the slot number is not used. However, it must be specified because the drive parameter at the end of the command is needed to select the correct drive.
:command:`/usr/lib/bareos/scripts/mtx-changer /dev/sg0 unload 3 /dev/nst0 0`
-:index:`\ <single: mtx-changer unload>`\
+.. index::
+ single: mtx-changer unload
will unload the tape into slot 3.
@@ -406,7 +385,8 @@ Once all the above commands work correctly, assuming that you have the right Cha
-::
+.. code-block:: sh
+ :caption: Testing if sleep is needed between unload and load
#!/bin/sh
/usr/lib/bareos/scripts/mtx-changer /dev/sg0 unload 1 /dev/nst0 0
@@ -422,7 +402,8 @@ A second problem that comes up with a small number of autochangers is that they
-::
+.. code-block:: sh
+ :caption: Testing if offline is needed
#!/bin/sh
/usr/lib/bareos/scripts/mtx-changer /dev/sg0 unload 1 /dev/nst0 0
@@ -441,13 +422,18 @@ Restore
Restore a pruned job using a pattern
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-:index:`\ <single: Restore; pruned job>`\ :index:`\ <single: Problem; Restore; pruned job>`\ :index:`\ <single: Regex>`\
+.. index::
+ single: Restore; pruned job
+ single: Problem; Restore; pruned file job
+ single: Regex
It is possible to configure Bareos in a way, that job information are still stored in the Bareos catalog, while the individual file information are already pruned.
If all File records are pruned from the catalog for a Job, normally Bareos can restore only all files saved. That is there is no way using the catalog to select individual files. With this new feature, Bareos will ask if you want to specify a Regex expression for extracting only a part of the full backup.
-::
+.. code-block:: bconsole
+ :caption: Restoring pruned files job using regex to filter
+
Building directory tree for JobId(s) 1,3 ...
There were no files inserted into the tree, so file selection
@@ -463,13 +449,16 @@ See also :ref:`FileRegex bsr option <FileRegex>` for more information.
Problems Restoring Files
~~~~~~~~~~~~~~~~~~~~~~~~
-:index:`\ <single: Restore; Files; Problem>`\ :index:`\ <single: Problem; Restoring Files>`\ :index:`\ <single: Problem; Tape; fixed mode>`\ :index:`\ <single: Problem; Tape; variable mode>`\
+.. index::
+ single: Restore; Files; Problem
+ single: Problem; Restoring Files
+ single: Problem; Tape; fixed mode
+ single: Problem; Tape; variable mode
The most frequent problems users have restoring files are error messages such as:
-
-::
+.. code-block:: none
04-Jan 00:33 z217-sd: RestoreFiles.2005-01-04_00.31.04 Error:
block.c:868 Volume data error at 20:0! Short block of 512 bytes on
@@ -480,8 +469,7 @@ The most frequent problems users have restoring files are error messages such as
or
-
-::
+.. code-block:: none
04-Jan 00:33 z217-sd: RestoreFiles.2005-01-04_00.31.04 Error:
block.c:264 Volume data error at 20:0! Wanted ID: "BB02", got ".".
@@ -501,10 +489,13 @@ Try the following things, each separately, and reset your Device resource to wha
#. Use bextract to extract the files you want – it reads the Volume sequentially if you use the include list feature, or if you use a .bsr file, but remove all the VolBlock statements after the .bsr file is created (at the Run yes/mod/no) prompt but before you start the restore.
+
Restoring Files Can Be Slow
~~~~~~~~~~~~~~~~~~~~~~~~~~~
-:index:`\ <single: Restore; slow>`\ :index:`\ <single: Problem; Restore; slow>`\
+.. index::
+ single: Restore; slow
+ single: Problem; Restore; slow
Restoring files is generally much slower than backing them up for several reasons. The first is that during a backup the tape is normally already positioned and Bareos only needs to write. On the other hand, because restoring files is done so rarely, Bareos keeps only the start file and block on the tape for the whole job rather than on a file by file basis which would use quite a lot of space in the catalog.
@@ -514,6 +505,7 @@ Finally, instead of just reading a file for backup, during the restore, Bareos m
For all the above reasons the restore process is generally much slower than backing up (sometimes it takes three times as long).
+
.. _section-RestoreCatalog:
Restoring When Things Go Wrong
@@ -555,9 +547,9 @@ Solution with a Catalog backup
Where: /tmp/bareos-restores
Replace: always
FileSet: Full Set
- Client: rufus-fd
+ Client: bareos-fd
Storage: File
- When: 2005-07-10 17:33:40
+ When: 2012-12-12 17:33:40
Catalog: MyCatalog
Priority: 10
OK to run? (yes/mod/no):
@@ -579,26 +571,26 @@ Solution with a Job listing
::
22-Apr 10:22 HeadMan: Start Backup JobId 7510,
- Job=CatalogBackup.2005-04-22_01.10.0
- 22-Apr 10:23 HeadMan: Bareos 1.37.14 (21Apr05): 22-Apr-2005 10:23:06
+ Job=CatalogBackup.2015-04-22_01.10.0
+ 22-Apr 10:23 HeadMan: Bareos 14.2.8 (21Apr15): 22-Apr-2015 10:23:06
JobId: 7510
- Job: CatalogBackup.2005-04-22_01.10.00
+ Job: CatalogBackup.2015-04-22_01.10.00
Backup Level: Full
- Client: Polymatou
- FileSet: "CatalogFile" 2003-04-10 01:24:01
+ Client: bareos-fd
+ FileSet: "CatalogFile" 2013-04-10 01:24:01
Pool: "Default"
Storage: "DLTDrive"
- Start time: 22-Apr-2005 10:21:00
- End time: 22-Apr-2005 10:23:06
+ Start time: 22-Apr-2015 10:21:00
+ End time: 22-Apr-2015 10:23:06
FD Files Written: 1
SD Files Written: 1
FD Bytes Written: 210,739,395
SD Bytes Written: 210,739,521
Rate: 1672.5 KB/s
Software Compression: None
- Volume name(s): DLT-22Apr05
+ Volume name(s): DLT-22Apr15
Volume Session Id: 11
- Volume Session Time: 1114075126
+ Volume Session Time: 1429607926
Last Volume Bytes: 1,428,240,465
Non-fatal FD errors: 0
SD Errors: 0
@@ -614,9 +606,9 @@ Solution with a Job listing
::
- Volume="DLT-22Apr05"
+ Volume="DLT-22Apr15"
VolSessionId=11
- VolSessionTime=1114075126
+ VolSessionTime=1429607926
FileIndex=1-1
@@ -629,9 +621,9 @@ Solution with a Job listing
::
- Volume="DLT-22Apr05"
+ Volume="DLT-22Apr15"
VolSessionId=11
- VolSessionTime=1114075126
+ VolSessionTime=1429607926
VolFile=118-118
VolBlock=0-4053
FileIndex=1-1
@@ -718,7 +710,7 @@ Solution
::
- ./bls -j -V DLT-22Apr05 /dev/nst0
+ ./bls -j -V DLT-22Apr15 /dev/nst0
@@ -727,20 +719,20 @@ Solution
::
bls: butil.c:258 Using device: "/dev/nst0" for reading.
- 21-Jul 18:34 bls: Ready to read from volume "DLT-22Apr05" on device "DLTDrive"
+ 21-Jul 18:34 bls: Ready to read from volume "DLT-22Apr15" on device "DLTDrive"
(/dev/nst0).
- Volume Record: File:blk=0:0 SessId=11 SessTime=1114075126 JobId=0 DataLen=164
+ Volume Record: File:blk=0:0 SessId=11 SessTime=1429607926 JobId=0 DataLen=164
...
- Begin Job Session Record: File:blk=118:0 SessId=11 SessTime=1114075126
+ Begin Job Session Record: File:blk=118:0 SessId=11 SessTime=1429607926
JobId=7510
- Job=CatalogBackup.2005-04-22_01.10.0 Date=22-Apr-2005 10:21:00 Level=F Type=B
- End Job Session Record: File:blk=118:4053 SessId=11 SessTime=1114075126
+ Job=CatalogBackup.2015-04-22_01.10.0 Date=22-Apr-2015 10:21:00 Level=F Type=B
+ End Job Session Record: File:blk=118:4053 SessId=11 SessTime=1429607926
JobId=7510
Date=22-Apr-2005 10:23:06 Level=F Type=B Files=1 Bytes=210,739,395 Errors=0
Status=T
...
21-Jul 18:34 bls: End of Volume at file 201 on device "DLTDrive" (/dev/nst0),
- Volume "DLT-22Apr05"
+ Volume "DLT-22Apr15"
21-Jul 18:34 bls: End of all volumes.