From 14ebc28e07e68ff412aa42f7d8b67969e2f63d00 Mon Sep 17 00:00:00 2001
From: Matthew Wilcox <mawilcox@microsoft.com>
Date: Fri, 22 Dec 2017 06:32:16 -0800
Subject: [PATCH] errseq: Add to documentation tree
Git-commit: 14ebc28e07e68ff412aa42f7d8b67969e2f63d00
Patch-mainline: v4.16-rc1
References: bsc#1107008

 - Move errseq.rst into core-api
 - Add errseq to the core-api index
 - Promote the header to a more prominent header type, otherwise we get three
   entries in the table of contents.
 - Reformat the table to look nicer and be a little more proportional in
   terms of horizontal width per bit (the SF bit is still disproportionately
   large, but there's no way to fix that).
 - Include errseq kernel-doc in the errseq.rst
 - Neaten some kernel-doc markup

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Jan Kara <jack@suse.cz>

---
 Documentation/core-api/errseq.rst |  159 ++++++++++++++++++++++++++++++++++++++
 Documentation/core-api/index.rst  |    1 
 Documentation/errseq.rst          |  149 -----------------------------------
 include/linux/errseq.h            |    2 
 lib/errseq.c                      |   37 +++++---
 5 files changed, 182 insertions(+), 166 deletions(-)
 create mode 100644 Documentation/core-api/errseq.rst
 delete mode 100644 Documentation/errseq.rst

--- /dev/null
+++ b/Documentation/core-api/errseq.rst
@@ -0,0 +1,159 @@
+=====================
+The errseq_t datatype
+=====================
+
+An errseq_t is a way of recording errors in one place, and allowing any
+number of "subscribers" to tell whether it has changed since a previous
+point where it was sampled.
+
+The initial use case for this is tracking errors for file
+synchronization syscalls (fsync, fdatasync, msync and sync_file_range),
+but it may be usable in other situations.
+
+It's implemented as an unsigned 32-bit value.  The low order bits are
+designated to hold an error code (between 1 and MAX_ERRNO).  The upper bits
+are used as a counter.  This is done with atomics instead of locking so that
+these functions can be called from any context.
+
+Note that there is a risk of collisions if new errors are being recorded
+frequently, since we have so few bits to use as a counter.
+
+To mitigate this, the bit between the error value and counter is used as
+a flag to tell whether the value has been sampled since a new value was
+recorded.  That allows us to avoid bumping the counter if no one has
+sampled it since the last time an error was recorded.
+
+Thus we end up with a value that looks something like this:
+
++--------------------------------------+----+------------------------+
+| 31..13                               | 12 | 11..0                  |
++--------------------------------------+----+------------------------+
+| counter                              | SF | errno                  |
++--------------------------------------+----+------------------------+
+
+The general idea is for "watchers" to sample an errseq_t value and keep
+it as a running cursor.  That value can later be used to tell whether
+any new errors have occurred since that sampling was done, and atomically
+record the state at the time that it was checked.  This allows us to
+record errors in one place, and then have a number of "watchers" that
+can tell whether the value has changed since they last checked it.
+
+A new errseq_t should always be zeroed out.  An errseq_t value of all zeroes
+is the special (but common) case where there has never been an error. An all
+zero value thus serves as the "epoch" if one wishes to know whether there
+has ever been an error set since it was first initialized.
+
+API usage
+=========
+
+Let me tell you a story about a worker drone.  Now, he's a good worker
+overall, but the company is a little...management heavy.  He has to
+report to 77 supervisors today, and tomorrow the "big boss" is coming in
+from out of town and he's sure to test the poor fellow too.
+
+They're all handing him work to do -- so much he can't keep track of who
+handed him what, but that's not really a big problem.  The supervisors
+just want to know when he's finished all of the work they've handed him so
+far and whether he made any mistakes since they last asked.
+
+He might have made the mistake on work they didn't actually hand him,
+but he can't keep track of things at that level of detail; all he can
+remember is the most recent mistake that he made.
+
+Here's our worker_drone representation::
+
+        struct worker_drone {
+                errseq_t        wd_err; /* for recording errors */
+        };
+
+Every day, the worker_drone starts out with a blank slate::
+
+        struct worker_drone wd;
+
+        wd.wd_err = (errseq_t)0;
+
+The supervisors come in and get an initial read for the day.  They
+don't care about anything that happened before their watch begins::
+
+        struct supervisor {
+                errseq_t        s_wd_err; /* private "cursor" for wd_err */
+                spinlock_t      s_wd_err_lock; /* protects s_wd_err */
+        };
+
+        struct supervisor       su;
+
+        su.s_wd_err = errseq_sample(&wd.wd_err);
+        spin_lock_init(&su.s_wd_err_lock);
+
+Now they start handing him tasks to do.  Every few minutes they ask him to
+finish up all of the work they've handed him so far.  Then they ask him
+whether he made any mistakes on any of it::
+
+        spin_lock(&su.s_wd_err_lock);
+        err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
+        spin_unlock(&su.s_wd_err_lock);
+
+Up to this point, that just keeps returning 0.
+
+Now, the owners of this company are quite miserly and have given him
+substandard equipment with which to do his job. Occasionally it
+glitches and he makes a mistake.  He sighs a heavy sigh, and marks it
+down::
+
+        errseq_set(&wd.wd_err, -EIO);
+
+...and then gets back to work.  The supervisors eventually poll again
+and they each get the error when they next check.  Subsequent calls will
+return 0, until another error is recorded, at which point it's reported
+to each of them once.
+
+Note that the supervisors can't tell how many mistakes he made, only
+whether one was made since they last checked, and the latest value
+recorded.
+
+Occasionally the big boss comes in for a spot check and asks the worker
+to do a one-off job for him. He's not really watching the worker
+full-time like the supervisors, but he does need to know whether a
+mistake occurred while his job was processing.
+
+He can just sample the current errseq_t in the worker, and then use that
+to tell whether an error has occurred later::
+
+        errseq_t since = errseq_sample(&wd.wd_err);
+        /* submit some work and wait for it to complete */
+        err = errseq_check(&wd.wd_err, since);
+
+Since he's just going to discard "since" after that point, he doesn't
+need to advance it here. He also doesn't need any locking since it's
+not usable by anyone else.
+
+Serializing errseq_t cursor updates
+===================================
+
+Note that the errseq_t API does not protect the errseq_t cursor during a
+check_and_advance operation. Only the canonical error code is handled
+atomically.  In a situation where more than one task might be using the
+same errseq_t cursor at the same time, it's important to serialize
+updates to that cursor.
+
+If that's not done, then it's possible for the cursor to go backward
+in which case the same error could be reported more than once.
+
+Because of this, it's often advantageous to first do an errseq_check to
+see if anything has changed, and only later do an
+errseq_check_and_advance after taking the lock. e.g.::
+
+        if (errseq_check(&wd.wd_err, READ_ONCE(su.s_wd_err))) {
+                /* su.s_wd_err is protected by s_wd_err_lock */
+                spin_lock(&su.s_wd_err_lock);
+                err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
+                spin_unlock(&su.s_wd_err_lock);
+        }
+
+That avoids the spinlock in the common case where nothing has changed
+since the last time it was checked.
+
+Functions
+=========
+
+.. kernel-doc:: lib/errseq.c
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -19,6 +19,7 @@ Core utilities
    workqueue
    genericirq
    flexible-arrays
+   errseq
 
 Interfaces for kernel debugging
 ===============================
--- a/Documentation/errseq.rst
+++ /dev/null
@@ -1,149 +0,0 @@
-The errseq_t datatype
-=====================
-An errseq_t is a way of recording errors in one place, and allowing any
-number of "subscribers" to tell whether it has changed since a previous
-point where it was sampled.
-
-The initial use case for this is tracking errors for file
-synchronization syscalls (fsync, fdatasync, msync and sync_file_range),
-but it may be usable in other situations.
-
-It's implemented as an unsigned 32-bit value.  The low order bits are
-designated to hold an error code (between 1 and MAX_ERRNO).  The upper bits
-are used as a counter.  This is done with atomics instead of locking so that
-these functions can be called from any context.
-
-Note that there is a risk of collisions if new errors are being recorded
-frequently, since we have so few bits to use as a counter.
-
-To mitigate this, the bit between the error value and counter is used as
-a flag to tell whether the value has been sampled since a new value was
-recorded.  That allows us to avoid bumping the counter if no one has
-sampled it since the last time an error was recorded.
-
-Thus we end up with a value that looks something like this::
-
-    bit:  31..13        12        11..0
-    +-----------------+----+----------------+
-    |     counter     | SF |      errno     |
-    +-----------------+----+----------------+
-
-The general idea is for "watchers" to sample an errseq_t value and keep
-it as a running cursor.  That value can later be used to tell whether
-any new errors have occurred since that sampling was done, and atomically
-record the state at the time that it was checked.  This allows us to
-record errors in one place, and then have a number of "watchers" that
-can tell whether the value has changed since they last checked it.
-
-A new errseq_t should always be zeroed out.  An errseq_t value of all zeroes
-is the special (but common) case where there has never been an error. An all
-zero value thus serves as the "epoch" if one wishes to know whether there
-has ever been an error set since it was first initialized.
-
-API usage
-=========
-Let me tell you a story about a worker drone.  Now, he's a good worker
-overall, but the company is a little...management heavy.  He has to
-report to 77 supervisors today, and tomorrow the "big boss" is coming in
-from out of town and he's sure to test the poor fellow too.
-
-They're all handing him work to do -- so much he can't keep track of who
-handed him what, but that's not really a big problem.  The supervisors
-just want to know when he's finished all of the work they've handed him so
-far and whether he made any mistakes since they last asked.
-
-He might have made the mistake on work they didn't actually hand him,
-but he can't keep track of things at that level of detail, all he can
-remember is the most recent mistake that he made.
-
-Here's our worker_drone representation::
-
-        struct worker_drone {
-                errseq_t        wd_err; /* for recording errors */
-        };
-
-Every day, the worker_drone starts out with a blank slate::
-
-        struct worker_drone wd;
-
-        wd.wd_err = (errseq_t)0;
-
-The supervisors come in and get an initial read for the day.  They
-don't care about anything that happened before their watch begins::
-
-        struct supervisor {
-                errseq_t        s_wd_err; /* private "cursor" for wd_err */
-                spinlock_t      s_wd_err_lock; /* protects s_wd_err */
-        }
-
-        struct supervisor       su;
-
-        su.s_wd_err = errseq_sample(&wd.wd_err);
-        spin_lock_init(&su.s_wd_err_lock);
-
-Now they start handing him tasks to do.  Every few minutes they ask him to
-finish up all of the work they've handed him so far.  Then they ask him
-whether he made any mistakes on any of it::
-
-        spin_lock(&su.su_wd_err_lock);
-        err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
-        spin_unlock(&su.su_wd_err_lock);
-
-Up to this point, that just keeps returning 0.
-
-Now, the owners of this company are quite miserly and have given him
-substandard equipment with which to do his job. Occasionally it
-glitches and he makes a mistake.  He sighs a heavy sigh, and marks it
-down::
-
-        errseq_set(&wd.wd_err, -EIO);
-
-...and then gets back to work.  The supervisors eventually poll again
-and they each get the error when they next check.  Subsequent calls will
-return 0, until another error is recorded, at which point it's reported
-to each of them once.
-
-Note that the supervisors can't tell how many mistakes he made, only
-whether one was made since they last checked, and the latest value
-recorded.
-
-Occasionally the big boss comes in for a spot check and asks the worker
-to do a one-off job for him. He's not really watching the worker
-full-time like the supervisors, but he does need to know whether a
-mistake occurred while his job was processing.
-
-He can just sample the current errseq_t in the worker, and then use that
-to tell whether an error has occurred later::
-
-        errseq_t since = errseq_sample(&wd.wd_err);
-        /* submit some work and wait for it to complete */
-        err = errseq_check(&wd.wd_err, since);
-
-Since he's just going to discard "since" after that point, he doesn't
-need to advance it here. He also doesn't need any locking since it's
-not usable by anyone else.
-
-Serializing errseq_t cursor updates
-===================================
-Note that the errseq_t API does not protect the errseq_t cursor during a
-check_and_advance_operation. Only the canonical error code is handled
-atomically.  In a situation where more than one task might be using the
-same errseq_t cursor at the same time, it's important to serialize
-updates to that cursor.
-
-If that's not done, then it's possible for the cursor to go backward
-in which case the same error could be reported more than once.
-
-Because of this, it's often advantageous to first do an errseq_check to
-see if anything has changed, and only later do an
-errseq_check_and_advance after taking the lock. e.g.::
-
-        if (errseq_check(&wd.wd_err, READ_ONCE(su.s_wd_err)) {
-                /* su.s_wd_err is protected by s_wd_err_lock */
-                spin_lock(&su.s_wd_err_lock);
-                err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
-                spin_unlock(&su.s_wd_err_lock);
-        }
-
-That avoids the spinlock in the common case where nothing has changed
-since the last time it was checked.
--- a/include/linux/errseq.h
+++ b/include/linux/errseq.h
@@ -1,5 +1,5 @@
 /*
- * See Documentation/errseq.rst and lib/errseq.c
+ * See Documentation/core-api/errseq.rst and lib/errseq.c
  */
 #ifndef _LINUX_ERRSEQ_H
 #define _LINUX_ERRSEQ_H
--- a/lib/errseq.c
+++ b/lib/errseq.c
@@ -45,14 +45,14 @@
  * @eseq: errseq_t field that should be set
  * @err: error to set (must be between -1 and -MAX_ERRNO)
  *
- * This function sets the error in *eseq, and increments the sequence counter
+ * This function sets the error in @eseq, and increments the sequence counter
  * if the last sequence was sampled at some point in the past.
  *
  * Any error set will always overwrite an existing error.
  *
- * We do return the latest value here, primarily for debugging purposes. The
- * return value should not be used as a previously sampled value in later calls
- * as it will not have the SEEN flag set.
+ * Return: The previous value, primarily for debugging purposes. The
+ * return value should not be used as a previously sampled value in later
+ * calls as it will not have the SEEN flag set.
  */
 errseq_t errseq_set(errseq_t *eseq, int err)
 {
@@ -107,11 +107,13 @@ errseq_t errseq_set(errseq_t *eseq, int
 EXPORT_SYMBOL(errseq_set);
 
 /**
- * errseq_sample - grab current errseq_t value
- * @eseq: pointer to errseq_t to be sampled
+ * errseq_sample() - Grab current errseq_t value.
+ * @eseq: Pointer to errseq_t to be sampled.
  *
  * This function allows callers to sample an errseq_t value, marking it as
  * "seen" if required.
+ *
+ * Return: The current errseq value.
  */
 errseq_t errseq_sample(errseq_t *eseq)
 {
@@ -133,15 +135,15 @@ errseq_t errseq_sample(errseq_t *eseq)
 EXPORT_SYMBOL(errseq_sample);
 
 /**
- * errseq_check - has an error occurred since a particular sample point?
- * @eseq: pointer to errseq_t value to be checked
- * @since: previously-sampled errseq_t from which to check
+ * errseq_check() - Has an error occurred since a particular sample point?
+ * @eseq: Pointer to errseq_t value to be checked.
+ * @since: Previously-sampled errseq_t from which to check.
  *
- * Grab the value that eseq points to, and see if it has changed "since"
- * the given value was sampled. The "since" value is not advanced, so there
+ * Grab the value that @eseq points to, and see if it has changed since
+ * the given value was sampled. The @since value is not advanced, so there
  * is no need to mark the value as seen.
  *
- * Returns the latest error set in the errseq_t or 0 if it hasn't changed.
+ * Return: The latest error set in the errseq_t or 0 if it hasn't changed.
  */
 int errseq_check(errseq_t *eseq, errseq_t since)
 {
@@ -154,11 +156,11 @@ int errseq_check(errseq_t *eseq, errseq_
 EXPORT_SYMBOL(errseq_check);
 
 /**
- * errseq_check_and_advance - check an errseq_t and advance to current value
- * @eseq: pointer to value being checked and reported
- * @since: pointer to previously-sampled errseq_t to check against and advance
+ * errseq_check_and_advance() - Check an errseq_t and advance to current value.
+ * @eseq: Pointer to value being checked and reported.
+ * @since: Pointer to previously-sampled errseq_t to check against and advance.
  *
- * Grab the eseq value, and see whether it matches the value that "since"
+ * Grab the eseq value, and see whether it matches the value that @since
  * points to. If it does, then just return 0.
  *
  * If it doesn't, then the value has changed. Set the "seen" flag, and try to
@@ -169,6 +171,9 @@ EXPORT_SYMBOL(errseq_check);
  * value. The caller must provide that if necessary. Because of this, callers
  * may want to do a lockless errseq_check before taking the lock and calling
  * this.
+ *
+ * Return: Negative errno if one has been stored, or 0 if no new error has
+ * occurred.
  */
 int errseq_check_and_advance(errseq_t *eseq, errseq_t *since)
 {