Commit Graph

4663 Commits (24d4e7f642882a8a13da170b4ba86eec8fa91bf2)

Author SHA1 Message Date
Rafael J. Wysocki bfaa07bc32 Merge branch 'pm-drivers'
* pm-drivers:
  rtc-cmos: report wakeups from interrupt handler
  PM / crypto / ux500: Use struct dev_pm_ops for power management
  PM / IPMI: Remove empty legacy PCI PM callbacks
  tpm_nsc: Use struct dev_pm_ops for power management
  tpm_tis: Use struct dev_pm_ops for power management
  tpm_atmel: Use struct dev_pm_ops for power management
  PM / TPM: Drop unused pm_message_t argument from tpm_pm_suspend()
  omap-rng: Use struct dev_pm_ops for power management
  mg_disk: Use struct dev_pm_ops for power management
  msi-laptop: Use struct dev_pm_ops for power management
  hdaps: Use struct dev_pm_ops for power management
  sonypi: Use struct dev_pm_ops for power management
  intel_mid_thermal: Use struct dev_pm_ops for power management
  acer-wmi: Use struct dev_pm_ops for power management
  intel_ips: Remove empty legacy PM callbacks
  thinkpad_acpi: Use struct dev_pm_ops instead of legacy PM routines
  thinkpad_acpi: Drop pm_message_t arguments from suspend routines
2012-07-19 00:03:42 +02:00
Dan Carpenter 6a3ca4f188 rbd: endian bug in rbd_req_cb()
Sparse complains about this because:
drivers/block/rbd.c:996:20: warning: cast to restricted __le32
drivers/block/rbd.c:996:20: warning: cast from restricted __le16

These are set in osd_req_encode_op() and they are le16.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Alex Elder <elder@inktank.com>
(cherry picked from commit 895cfcc810)
2012-07-17 21:30:31 -07:00
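The sparse warning in the entry above is about converting a 16-bit little-endian field with a 32-bit conversion. A minimal userspace sketch of why the width matters, modeling a big-endian host where the conversion is a byte swap. The helper names (`toy_swab16`, `be_host_le16_to_cpu`, and so on) are hypothetical stand-ins, not the kernel's helpers:

```c
#include <stdint.h>

/* Hypothetical byte-swap helpers; on a big-endian host, an
 * "le to cpu" conversion amounts to one of these swaps. */
static uint16_t toy_swab16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

static uint32_t toy_swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0xff00u) |
           ((v << 8) & 0xff0000u) | (v << 24);
}

/* Correct width: swap the two bytes of the 16-bit field. */
static uint32_t be_host_le16_to_cpu(uint16_t raw)
{
    return toy_swab16(raw);
}

/* Wrong width: the 16-bit value is widened first, then swapped as
 * 32 bits, so the payload lands in the upper half of the result. */
static uint32_t be_host_le32_to_cpu(uint16_t raw)
{
    return toy_swab32((uint32_t)raw);
}
```

With a raw field of 0x0112 (wire bytes 01 12 as seen by a big-endian CPU), the 16-bit conversion yields 0x1201, while the 32-bit conversion yields 0x12010000. On a little-endian host both conversions are no-ops, which is one reason such a bug can go unnoticed without sparse.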
Yan, Zheng 236df3755d rbd: Fix ceph_snap_context size calculation
ceph_snap_context->snaps is a u64 array

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <elder@inktank.com>
(cherry picked from commit f9f9a19044)
2012-07-17 21:30:19 -07:00
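The size calculation mentioned above depends on the element type of the trailing array. A sketch under stated assumptions: `struct toy_snap_ctx` is a hypothetical layout, not the kernel's `ceph_snap_context`, and the "wrong" variant is an invented mistake shown only for contrast:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with a trailing array of 64-bit snapshot ids. */
struct toy_snap_ctx {
    uint64_t seq;
    uint32_t num_snaps;
    uint64_t snaps[];   /* flexible array member */
};

/* Allocation size: header plus one 64-bit slot per snapshot. */
static size_t toy_snap_ctx_size(uint32_t num_snaps)
{
    return sizeof(struct toy_snap_ctx) +
           (size_t)num_snaps * sizeof(uint64_t);
}

/* Invented mistake for contrast: sizing the slots as 32-bit values
 * under-allocates by 4 bytes per snapshot. */
static size_t toy_snap_ctx_size_wrong(uint32_t num_snaps)
{
    return sizeof(struct toy_snap_ctx) +
           (size_t)num_snaps * sizeof(uint32_t);
}
```

For four snapshots, the two results differ by 16 bytes, which is the amount of memory that would be written past the end of the allocation.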
Silva Paulo 68d740d79c blk: fix wrong idr_pre_get() error check in loop.c
The idr_pre_get() function never returns a value < 0.  It returns 0 (no
memory) or 1 (OK).

Reported-by: Silva Paulo <psdasilva@yahoo.com>
[ Rewrote Silva's patch, but attributing it to Silva anyway  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-14 15:39:58 -07:00
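The return convention described above (0 for no memory, 1 for OK, never negative) means a `< 0` test can never fire. A small sketch, where `toy_pre_get` is a hypothetical stand-in that only mimics that convention and is not the real `idr_pre_get()`:

```c
#include <stdbool.h>

/* Hypothetical stand-in: returns 0 on allocation failure, 1 on success. */
static int toy_pre_get(bool have_memory)
{
    return have_memory ? 1 : 0;
}

/* Buggy check: the result is never negative, so failure is missed. */
static bool failure_detected_buggy(bool have_memory)
{
    return toy_pre_get(have_memory) < 0;
}

/* Fixed check: treat 0 as the failure value. */
static bool failure_detected_fixed(bool have_memory)
{
    return toy_pre_get(have_memory) == 0;
}
```

With no memory available, the buggy check reports no failure while the fixed check reports one.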
Rafael J. Wysocki 156ffcb42a mg_disk: Use struct dev_pm_ops for power management
Make the mg_disk driver define its PM callbacks through
a struct dev_pm_ops object rather than by using legacy PM hooks
in struct platform_driver.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-07-06 19:07:00 +02:00
Andi Kleen 0cc15d03bc floppy: Run floppy initialization asynchronous
floppy_init is quite slow, 3s on my test system to determine
that there is no floppy. Run it asynchronously to the other
init calls to improve boot time.

[jkosina@suse.cz: fix modular build]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-07-04 14:01:40 +02:00
Linus Torvalds dab058fd5f floppy: cancel any pending fd_timeouts before adding a new one
In commit 070ad7e793 ("floppy: convert to delayed work and
single-thread wq") the 'fd_timeout' timer was converted to a delayed
work.  However, the "del_timer(&fd_timeout)" was lost in the process,
and any previous pending timeouts would stay active when we then
re-queued the timeout.

This resulted in the floppy probe sequence having a (stale) 20s timeout
rather than the intended 3s timeout, and thus made booting with the
floppy driver (but no actual floppy controller) take much longer than it
should.

Of course, there's little reason for most people to compile the floppy
driver into the kernel at all, which is why most people never noticed.

Canceling the delayed work where we used to do the del_timer() fixes the
issue, and makes the floppy probing use the proper new timeout instead.
The three second timeout is still very wasteful, but better than the 20s
one.

Reported-and-tested-by: Andi Kleen <ak@linux.intel.com>
Reported-and-tested-by: Calvin Walton <calvin.walton@kepstin.ca>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-03 15:51:22 -07:00
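The stale-timeout effect described above can be illustrated with a toy model. Everything here is hypothetical (the `toy_*` names are not the workqueue API); the model assumes that queueing an already-pending item is a no-op, so the old deadline survives unless the item is cancelled first:

```c
#include <stdbool.h>

/* Toy delayed work item: pending flag plus absolute expiry in seconds. */
struct toy_delayed_work {
    bool pending;
    int  fires_at;
};

/* Assumed behaviour: queueing a pending item changes nothing. */
static bool toy_queue(struct toy_delayed_work *w, int now, int delay)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->fires_at = now + delay;
    return true;
}

static void toy_cancel(struct toy_delayed_work *w)
{
    w->pending = false;
}

/* Re-arm without cancelling: a stale deadline stays in effect. */
static int rearm_without_cancel(struct toy_delayed_work *w, int now, int delay)
{
    toy_queue(w, now, delay);
    return w->fires_at;
}

/* Re-arm after cancelling: the new deadline takes effect. */
static int rearm_with_cancel(struct toy_delayed_work *w, int now, int delay)
{
    toy_cancel(w);
    toy_queue(w, now, delay);
    return w->fires_at;
}
```

If a 20s timeout is already pending at time 0, re-arming for 3s without a cancel still fires at 20, while cancelling first makes it fire at 3.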
Sage Weil 9a64e8e0ac Linux 3.5-rc1
Merge tag 'v3.5-rc1'

Linux 3.5-rc1

Conflicts:
	net/ceph/messenger.c
2012-06-15 12:32:04 -07:00
Jens Axboe 6d407cfaf5 Merge branch 'for-jens' of git://git.drbd.org/linux-drbd into for-linus 2012-06-13 21:19:42 +02:00
Jens Axboe 987751719b Merge branch 'stable/for-jens-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus 2012-06-13 21:19:06 +02:00
Tao Guo 32587371ad umem: fix up unplugging
Fix a regression introduced by 7eaceaccab ("block: remove per-queue
plugging").  In that patch, Jens removed the whole mm_unplug_device()
function, which used to be the trigger to make umem start to work.

We need to implement unplugging to make umem start to work, or I/O will
never be triggered.

Signed-off-by: Tao Guo <Tao.Guo@emc.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-13 21:17:21 +02:00
Lars Ellenberg 0d5934e3c2 drbd: fix null pointer dereference with on-congestion policy when diskless
We must not look at mdev->actlog, unless we have a get_ldev() reference.
It also does not make much sense to try to disconnect or pull-ahead of
the peer, if we don't have good local data.

Only even consider congestion policies, if our local disk is D_UP_TO_DATE.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-06-12 14:35:19 +02:00
Lars Ellenberg 1ed25b269e drbd: fix list corruption by failing but already aborted reads
If a read is aborted due to force-detach of a supposedly unresponsive
local backing device, and retried on the peer, it can happen that the
local request later still completes (hopefully with an error).
As it may already have been completed to upper layers meanwhile,
it must not be retried again now.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-06-12 14:34:51 +02:00
Lars Ellenberg 4eccc57979 drbd: fix access of unallocated pages and kernel panic
BUG: unable to handle kernel NULL pointer dereference at (null)
...
 [<d1e17561>] ? _drbd_bm_set_bits+0x151/0x240 [drbd]
 [<d1e236f8>] ? receive_bitmap+0x4f8/0xbc0 [drbd]

This fixes an off-by-one error in the receive_bitmap() path,
if run-length encoded bitmap transfer is enabled.

If the bitmap is an exact multiple of PAGE_SIZE, which means the visible
capacity of the drbd device is an exact multiple of 128 MiB (for 4k page
size), and bitmap compression (use-rle) is enabled (which became default
with 8.4), and the very last bit is dirty and reported in an rle
compressed bitmap packet, we ended up trying to kmap_atomic a page pointer
that does not exist (bitmap->bm_pages[last index + 1]).

bug introduced by:
    Date:   Fri Jul 24 15:33:24 2009 +0200
    set bits: optimize for complete last word, fix off-by-one-word corner case

made effective by:
    Date:   Thu Dec 16 00:32:38 2010 +0100
    drbd: get rid of unused debug code

    Long time ago, we had paranoia code in the bitmap that allocated one
    extra word, assigned a magic value, and checked on every occasion that
    the magic value was still unchanged.

    That debug code is unused, the extra long word complicates code a bit.
    Get rid of it.

No-one triggered this bug in the last few years, because a large subset
of our userbase is unaffected:
 * typically the last few blocks of a device are not modified
   frequently, and remain unset
 * use-rle was disabled by default in drbd < 8.4
 * those with slightly "odd" device sizes, or
 * drbd internal meta data (which will skew the device size slightly,
   thus makes it harder to have a bug relevant device size)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-06-12 14:32:48 +02:00
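The corner case in the entry above can be checked with simple arithmetic. This sketch assumes one bitmap bit per 4 KiB of storage, which is what the stated "128 MiB per 4k page" implies (4096 bytes * 8 bits * 4 KiB = 128 MiB). The macro and function names are hypothetical:

```c
#include <stddef.h>

#define TOY_PAGE_SIZE     4096u
#define TOY_BITS_PER_PAGE (TOY_PAGE_SIZE * 8u)  /* 32768 bits per bitmap page */
#define TOY_BYTES_PER_BIT 4096u                 /* assumption, see lead-in */

/* Number of bitmap pages needed to hold total_bits bits. */
static size_t toy_pages_needed(size_t total_bits)
{
    return (total_bits + TOY_BITS_PER_PAGE - 1) / TOY_BITS_PER_PAGE;
}

/* Index of the bitmap page that holds a given bit. */
static size_t toy_page_of_bit(size_t bit)
{
    return bit / TOY_BITS_PER_PAGE;
}
```

For a 128 MiB device the bitmap has 32768 bits and fits exactly one page. The last valid bit (32767) lives on page 0, but an off-by-one access to bit 32768 maps to page 1, which equals the page count and so does not exist. With any device size that is not an exact multiple, the same off-by-one stays inside the last, partially used page.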
Konrad Rzeszutek Wilk 6878c32e5c xen/blkfront: Add WARN to deal with misbehaving backends.
Part of the ring structure is the 'id' field which is under
control of the frontend. The frontend stamps it with "some"
value (this some in this implementation being a value less
than BLK_RING_SIZE), and when it gets a response expects
said value to be in the response structure. We have a check
for the id field when spooling new requests but not when
de-spooling responses.

We also add an extra check in add_id_to_freelist to make
sure that the 'struct request' was not NULL - as we cannot
pass a NULL to __blk_end_request_all, otherwise that crashes
(and all the operations that the response is dealing with
end up with __blk_end_request_all).

Lastly we also print the name of the operation that failed.

[v1: s/BUG/WARN/ suggested by Stefano]
[v2: Add extra check in add_id_to_freelist]
[v3: Redid op_name per Jan's suggestion]
[v4: add const * and add WARN on failure returns]
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-06-12 08:29:04 -04:00
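The checks described above (the response id must be below the ring size, and the matching request must not be NULL) can be sketched as follows. `TOY_RING_SIZE`, `struct toy_shadow` and `toy_lookup_response` are hypothetical names for illustration, not the blkfront code:

```c
#include <stddef.h>

#define TOY_RING_SIZE 32u   /* placeholder value, not the real BLK_RING_SIZE */

/* Hypothetical shadow slot: remembers the request issued with this id. */
struct toy_shadow {
    void *request;
};

/* Returns the request for a response id, or NULL when the id is out of
 * range or the slot is empty; the caller should warn and skip it. */
static void *toy_lookup_response(struct toy_shadow *shadow, unsigned long id)
{
    if (id >= TOY_RING_SIZE)
        return NULL;              /* backend sent an id we never issued */
    return shadow[id].request;    /* may be NULL if the slot is unused */
}
```

A caller would test the result before completing the request, so a misbehaving backend leads to a warning instead of a NULL dereference.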
Dan Carpenter 895cfcc810 rbd: endian bug in rbd_req_cb()
Sparse complains about this because:
drivers/block/rbd.c:996:20: warning: cast to restricted __le32
drivers/block/rbd.c:996:20: warning: cast from restricted __le16

These are set in osd_req_encode_op() and they are le16.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2012-06-06 09:23:54 -05:00
Yan, Zheng f9f9a19044 rbd: Fix ceph_snap_context size calculation
ceph_snap_context->snaps is a u64 array

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2012-06-06 09:23:53 -05:00
Asai Thambi S P 7b421d24ea mtip32xx: Create debugfs entries for troubleshooting
On module load, creates a debugfs parent 'rssd' in debugfs root. Then for each
device, create a new node with corresponding disk name. Under the new node, two
entries 'registers' and 'flags' are created.

NOTE: These entries were removed from sysfs in the previous patch

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-05 09:13:49 +02:00
Asai Thambi S P 7412ff139d mtip32xx: Remove 'registers' and 'flags' from sysfs
This patch removes entries 'registers' and 'flags' from sysfs. Updated ABI file
to reflect this change.

Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-05 09:13:48 +02:00
Sachin Kamat 87c9ea76a2 mtip32xx: Remove version.h header file inclusion
version.h header file inclusion is no longer required.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
2012-06-04 10:00:32 +02:00
Asai Thambi S P b77874c969 mtip32xx: Changes to sysfs entries
* Formatted the output of 'registers' entry
* Added 'Commands in Q' to output of 'registers' entry
* Added a new entry 'flags'

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:46:50 +02:00
Asai Thambi S P 8ce800935d mtip32xx: Convert macro definitions for flag bits to enum
Convert macro definitions for flags bits to enum

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:46:50 +02:00
Asai Thambi S P 377b8fc6d7 mtip32xx: minor performance tweak
When checking for command completions if the register value is zero, proceed
to next register.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:46:50 +02:00
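The tweak described above, skipping a completion register whose value is zero, might look like the following sketch. The register count and function name are hypothetical, and the registers are modeled as a plain array rather than device memory:

```c
#include <stdint.h>

#define TOY_NUM_REGS 8   /* placeholder register count */

/* Counts completed slots; registers reading zero are skipped without
 * scanning their bits. *regs_scanned reports how many were scanned. */
static unsigned toy_count_completions(const uint32_t regs[TOY_NUM_REGS],
                                      unsigned *regs_scanned)
{
    unsigned done = 0;

    *regs_scanned = 0;
    for (int i = 0; i < TOY_NUM_REGS; i++) {
        uint32_t v = regs[i];

        if (v == 0)
            continue;   /* nothing completed here, go to next register */

        (*regs_scanned)++;
        for (int bit = 0; bit < 32; bit++)
            if (v & (UINT32_C(1) << bit))
                done++;
    }
    return done;
}
```

With only two non-zero registers out of eight, the inner bit scan runs twice instead of eight times.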
Asai Thambi S P e602878fd8 mtip32xx: Fix to support more than one sector in exec_drive_command()
Fix to support more than one sector in exec_drive_command().

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:46:50 +02:00
Asai Thambi S P 0a07ab224a mtip32xx: Use plain spinlock for 'cmd_issue_lock'
'cmd_issue_lock' is for only acquiring a free slot, and it is not used
in interrupt context. So replaced irq version with non-irq version of spinlock.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:46:50 +02:00
Asai Thambi S P 6c8ab69818 mtip32xx: Set block queue boundary variables
Set the following block queue boundary variables
	* max_hw_sectors
	* max_segment_size

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>

Removed setting of q->nr_requests.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:46:50 +02:00
Asai Thambi S P d02e1f0ad0 mtip32xx: Fix to handle TFE for PIO(IOCTL/internal) commands
If a PIO (IOCTL/internal) command resulted in TFE, signal the wait event or break out of polling.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:36:55 +02:00
Asai Thambi S P 971890f258 mtip32xx: Change HDIO_GET_IDENTITY to return stored data
For the ioctl command HDIO_GET_IDENTITY, return the stored copy of IDENTIFY
DATA instead of sending the command to the device - similar to libata.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:36:55 +02:00
Asai Thambi S P 2df7aa96e7 mtip32xx: Set custom timeouts for PIO commands
This change sets custom timeouts depending on PIO command.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:36:55 +02:00
Asai Thambi S P 6bb688c048 mtip32xx: fix clearing an incorrect register in mtip_init_port
Fix clearing an incorrect register in mtip_init_port

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-31 08:36:55 +02:00
Konrad Rzeszutek Wilk 8c9ce606a6 xen/blkback: Copy id field when doing BLKIF_DISCARD.
We weren't copying the id field so when we sent the response
back to the frontend (especially with a 64-bit host and 32-bit
guest), we ended up using a random value. This led to the
frontend crashing as it would try to pass to __blk_end_request_all
a NULL 'struct request' (b/c it would use the 'id' to find the
proper 'struct request' in its shadow array) and end up crashing:

BUG: unable to handle kernel NULL pointer dereference at 000000e4
IP: [<c0646d4c>] __blk_end_request_all+0xc/0x40
.. snip..
EIP is at __blk_end_request_all+0xc/0x40
.. snip..
 [<ed95db72>] blkif_interrupt+0x172/0x330 [xen_blkfront]

This fixes the bug by passing in the proper id for the response.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=824641

CC: stable@kernel.org
Tested-by: William Dauchy <wdauchy@gmail.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-30 17:20:04 -04:00
Linus Torvalds af56e0aa35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph updates from Sage Weil:
 "There are some updates and cleanups to the CRUSH placement code, a bug
  fix with incremental maps, several cleanups and fixes from Josh Durgin
  in the RBD block device code, a series of cleanups and bug fixes from
  Alex Elder in the messenger code, and some miscellaneous bounds
  checking and gfp cleanups/fixes."

Fix up trivial conflicts in net/ceph/{messenger.c,osdmap.c} due to the
networking people preferring "unsigned int" over just "unsigned".

* git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (45 commits)
  libceph: fix pg_temp updates
  libceph: avoid unregistering osd request when not registered
  ceph: add auth buf in prepare_write_connect()
  ceph: rename prepare_connect_authorizer()
  ceph: return pointer from prepare_connect_authorizer()
  ceph: use info returned by get_authorizer
  ceph: have get_authorizer methods return pointers
  ceph: ensure auth ops are defined before use
  ceph: messenger: reduce args to create_authorizer
  ceph: define ceph_auth_handshake type
  ceph: messenger: check return from get_authorizer
  ceph: messenger: rework prepare_connect_authorizer()
  ceph: messenger: check prepare_write_connect() result
  ceph: don't set WRITE_PENDING too early
  ceph: drop msgr argument from prepare_write_connect()
  ceph: messenger: send banner in process_connect()
  ceph: messenger: reset connection kvec caller
  libceph: don't reset kvec in prepare_write_banner()
  ceph: ignore preferred_osd field
  ceph: fully initialize new layout
  ...
2012-05-30 11:17:19 -07:00
Linus Torvalds a70f35af4e Merge branch 'for-3.5/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "Here are the driver related changes for 3.5.  It contains:

   - The floppy changes from Jiri.  Jiri is now also marked as the
     maintainer of floppy.c, I shall be publicly branding his forehead
     with red hot iron at the next opportune moment.

   - A batch of drbd updates and fixes from the linbit crew, as well as
     fixes from others.

   - Two small fixes for xen-blkfront courtesy of Jan."

* 'for-3.5/drivers' of git://git.kernel.dk/linux-block: (70 commits)
  floppy: take over maintainership
  floppy: remove floppy-specific O_EXCL handling
  floppy: convert to delayed work and single-thread wq
  xen-blkfront: module exit handling adjustments
  xen-blkfront: properly name all devices
  drbd: grammar fix in log message
  drbd: check MODULE for THIS_MODULE
  drbd: Restore the request restart logic
  drbd: introduce a bio_set to allocate housekeeping bios from
  drbd: remove unused define
  drbd: bm_page_async_io: properly initialize page->private
  drbd: use the newly introduced page pool for bitmap IO
  drbd: add page pool to be used for meta data IO
  drbd: allow bitmap to change during writeout from resync_finished
  drbd: fix race between drbdadm invalidate/verify and finishing resync
  drbd: fix resend/resubmit of frozen IO
  drbd: Ensure that data_size is not 0 before using data_size-1 as index
  drbd: Delay/reject other state changes while establishing a connection
  drbd: move put_ldev from __req_mod() to the endio callback
  drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE
  ...
2012-05-30 09:05:47 -07:00
Linus Torvalds 99262a3daf Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1
Merge tag 'virtio-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

Pull virtio updates from Rusty Russell.

* tag 'virtio-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio: fix typo in comment
  virtio-mmio: Devices parameter parsing
  virtio_blk: Drop unused request tracking list
  virtio-blk: Fix hot-unplug race in remove method
  virtio: Use ida to allocate virtio index
  virtio: balloon: separate out common code between remove and freeze functions
  virtio: balloon: drop restore_common()
  9p: disconnect channel when PCI device is removed
  virtio: update documentation to v0.9.5 of spec
2012-05-21 20:20:23 -07:00
Asias He f65ca1dc6a virtio_blk: Drop unused request tracking list
Benchmark shows small performance improvement on fusion io device.

Before:
  seq-read : io=1,024MB, bw=19,982KB/s, iops=39,964, runt= 52475msec
  seq-write: io=1,024MB, bw=20,321KB/s, iops=40,641, runt= 51601msec
  rnd-read : io=1,024MB, bw=15,404KB/s, iops=30,808, runt= 68070msec
  rnd-write: io=1,024MB, bw=14,776KB/s, iops=29,552, runt= 70963msec

After:
  seq-read : io=1,024MB, bw=20,343KB/s, iops=40,685, runt= 51546msec
  seq-write: io=1,024MB, bw=20,803KB/s, iops=41,606, runt= 50404msec
  rnd-read : io=1,024MB, bw=16,221KB/s, iops=32,442, runt= 64642msec
  rnd-write: io=1,024MB, bw=15,199KB/s, iops=30,397, runt= 68991msec

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-05-22 12:16:14 +09:30
Asias He b79d866c8b virtio-blk: Fix hot-unplug race in remove method
If we reset the virtio-blk device before the requests already dispatched
to the virtio-blk driver from the block layer are finished, we will get stuck
in blk_cleanup_queue() and the remove will fail.

blk_cleanup_queue() calls blk_drain_queue() to drain all requests queued
before DEAD marking. However it will never succeed if the device is
already stopped. We'll have q->in_flight[] > 0, so the drain will not
finish.

How to reproduce the race:
1. hot-plug a virtio-blk device
2. keep reading/writing the device in guest
3. hot-unplug while the device is busy serving I/O

Test:
~1000 rounds of hot-plug/hot-unplug test passed with this patch.

Changes in v3:
- Drop blk_abort_queue and blk_abort_request
- Use __blk_end_request_all to complete request dispatched to driver

Changes in v2:
- Drop req_in_flight
- Use virtqueue_detach_unused_buf to get request dispatched to driver

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-05-22 12:16:13 +09:30
David S. Miller 17eea0df5f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-05-20 21:53:04 -04:00
Linus Torvalds 14e931a264 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "A few small, but important fixes.  Most of them are marked for stable
  as well

   - Fix failure to release a semaphore on error path in mtip32xx.
   - Fix crashable condition in bio_get_nr_vecs().
   - Don't mark end-of-disk buffers as mapped, limit it to i_size.
   - Fix for build problem with CONFIG_BLOCK=n on arm at least.
   - Fix for a buffer overflow on UUID partition printing.
   - Trivial removal of unused variables in dac960."

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix buffer overflow when printing partition UUIDs
  Fix blkdev.h build errors when BLOCK=n
  bio allocation failure due to bio_get_nr_vecs()
  block: don't mark buffers beyond end of disk as mapped
  mtip32xx: release the semaphore on an error path
  dac960: Remove unused variables from DAC960_CreateProcEntries()
2012-05-19 10:12:17 -07:00
Jens Axboe 4fd1ffaa12 Merge branch 'for-jens' of git://git.drbd.org/linux-drbd into for-3.5/drivers
Philipp writes:

These are the updates we have in the drbd-8.3 tree. They are intended
for your "for-3.5/drivers" drivers branch.

These changes include one new feature:
 * Allow detach from frozen backing devices with the new --force option;
   configurable timeout for backing devices by the new disk-timeout option

And a huge number of bug fixes:
 * Fixed a write ordering problem on SyncTarget nodes for a write
   to a block that gets resynced at the same time. The bug can
   only be triggered with a device that has a firmware that
   actually reorders writes to the same block
 * Fixed a race between disconnect and receive_state that could cause
   an IO lockup
 * Fixed resend/resubmit for requests with disk or network timeout
 * Make sure that hard state changes do not disturb the connection
   establishing process (e.g. detach due to an IO error). When the
   bug was triggered it caused a retry in the connect process
 * Postpone soft state changes so they do not disturb the connection
   establishing process (e.g. becoming primary). When the bug
   was triggered it could cause both nodes to go into SyncSource state
 * Fixed a refcount leak that could cause failures when trying to
   unload a protocol family module that was used by DRBD
 * Dedicated page pool for meta data IOs
 * Deny normal detach (as opposed to --forced) if the user tries
   to detach from the last UpToDate disk in the resource
 * Fixed a possible protocol error that could be caused by
   "unusual" BIOs.
 * Enforce the disk-timeout option also on meta-data IO operations
 * Implemented stable bitmap pages when we do a full write out of
   the bitmap
 * Fixed a rare compatibility issue with DRBD versions older than 8.3.7
   when negotiating the bio_size
 * Fixed a rare race condition where an empty resync could stall
   if pause/unpause events happen in parallel
 * Made the re-establishing of connections quicker, if it got a broken pipe
   once. Previously there was a bug in the code that caused it to waste the first
   successfully established connection after a broken pipe event.

PS: I am postponing drbd-8.4 for mainline for one or two more kernel
    development cycles (the ~400-patch set).
2012-05-18 16:20:06 +02:00
Jens Axboe 13828dec45 Merge branch 'stable/for-jens-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.5/drivers
Konrad writes:

Please git pull the following branch:

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/for-jens-3.5

in your for-3.5/drivers branch. The changes in it are rather simple - cleaning
up some code and adding proper mechanism to unload without leaking memory.
2012-05-18 16:17:41 +02:00
Jiri Kosina bfa10b8c98 floppy: remove floppy-specific O_EXCL handling
Block layer now handles O_EXCL in a generic way for block devices.

The semantics is however different for floppy and all other block devices,
as floppy driver contains its own O_EXCL handling.

The semantics for all-but-floppy bdevs is "there can be at most one O_EXCL
open of this file", while for floppy bdev the semantics is "if someone has
the bdev open with O_EXCL, no one else can open it".

There is an actual userspace-observable change in behavior because of this
since commit e525fd89d3 ("block: make blkdev_get/put() handle exclusive
access") -- on kernels containing this commit, mount of /dev/fd0 causes
the fd0 block device to be claimed with _EXCL, preventing subsequent
open(/dev/fd0).

Bring things back into shape, i.e.  make it possible, analogously to
other block devices, to mount the floppy and open() it afterwards --
remove the floppy-specific handling and let the generic bdev code O_EXCL
handling take over.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-05-18 15:19:11 +02:00
Jiri Kosina 070ad7e793 floppy: convert to delayed work and single-thread wq
There are several races in floppy driver between bottom half
(scheduled_work) and timers (fd_timeout, fd_timer). Due to slowness
of the actual floppy devices, those races are never (at least to my
knowledge) triggered on bare-metal floppy hardware. However, on virtualized
(emulated) floppy drives, which are of course orders of magnitude faster
than the real ones, these races trigger reliably. They usually manifest
themselves as NULL pointer dereferences during DMA setup, such as

	BUG: unable to handle kernel NULL pointer dereference at 0000000a
	[ ... snip ... ]
	EIP: 0060:[<c02053d5>] EFLAGS: 00010293 CPU: 0
	EAX: ffffe000 EBX: 0000000a ECX: 00000000 EDX: 0000000a
	ESI: c05d2718 EDI: 00000000 EBP: 00000000 ESP: f540fe44
	 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
	Process swapper (pid: 0, ti=f540e000 task=c082d5a0 task.ti=c0826000)
	Stack:
	 ffffe000 00001ffc 00000000 00000000 00000000 c05d2718 c0708b40 f540fe80
	 c020470f c05d2718 c0708b40 00000000 f540fe80 0000000a f540fee4 00000000
	 c0708b40 f540fee4 00000000 00000000 c020526b 00000000 c05d2718 c0708b40
	Call Trace:
	 [<c020470f>] dump_trace+0xaf/0x110
	 [<c020526b>] show_trace_log_lvl+0x4b/0x60
	 [<c0205298>] show_trace+0x18/0x20
	 [<c05c5811>] dump_stack+0x6d/0x72
	 [<c0248527>] warn_slowpath_common+0x77/0xb0
	 [<c02485f3>] warn_slowpath_fmt+0x33/0x40
	 [<f7ec593c>] setup_DMA+0x14c/0x210 [floppy]
	 [<f7ecaa95>] setup_rw_floppy+0x105/0x190 [floppy]
	 [<c0256d08>] run_timer_softirq+0x168/0x2a0
	 [<c024e762>] __do_softirq+0xc2/0x1c0
	 [<c02042ed>] do_softirq+0x7d/0xb0
	 [<f54d8a00>] 0xf54d89ff

but other instances can be easily seen as well. This can be observed at least under
VMWare, VirtualBox and KVM.

This patch converts all the timers and bottom halves to be processed in a single
workqueue. This approach was already discussed back in 2010 if I remember
correctly, and Acked by Linus [1], but it then never made it to the tree.

This all is based on the original idea and code of Stephen Hemminger.  I have
ported Stephen's original code to the current state of the floppy driver, and
performed quite some testing (on real hardware), which didn't reveal any issues
(this includes not only writing and reading data, but also formatting
(unfortunately I didn't find any Double-Density disks any more)). Ability to
handle errors properly (supplying known bad floppies) has also been verified.

[1] http://kerneltrap.org/mailarchive/linux-kernel/2010/6/11/4582092

Based-on-patch-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-05-18 15:19:10 +02:00
David S. Miller 028940342a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-05-16 22:17:37 -04:00
Josh Durgin 263c6ca007 rbd: rename __rbd_update_snaps to __rbd_refresh_header
This function rereads the entire header and handles any changes in
it, not just changes in snapshots.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2012-05-14 12:13:09 -05:00
Josh Durgin 3591538fb2 rbd: fix snapshot size type
Snapshot sizes should be the same type as regular image sizes. This
only affects their displayed size in sysfs, not the reported size of
an actual block device.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2012-05-14 12:13:03 -05:00
Josh Durgin b06e6a6be7 rbd: remove conditional snapid parameters
The snapid parameters passed to rbd_do_op() and rbd_req_sync_op()
are now always either a valid snapid or an explicit CEPH_NOSNAP.

[elder@dreamhost.com: Rephrased the description]

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2012-05-14 12:12:58 -05:00
Josh Durgin 77dfe99fe3 rbd: store snapshot id instead of index
When a device was open at a snapshot, and snapshots were deleted or
added, data from the wrong snapshot could be read. Instead of
assuming the snap context is constant, store the actual snap id when
the device is initialized, and rely on the OSDs to signal an error
if we try reading from a snapshot that was deleted.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2012-05-14 12:12:52 -05:00
Josh Durgin 403f24d3d5 rbd: protect read of snapshot sequence number
This is updated whenever a snapshot is added or deleted, and the
snapc pointer is changed with every refresh of the header.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2012-05-14 12:12:46 -05:00
Xi Wang 50f7c4c967 rbd: fix integer overflow in rbd_header_from_disk()
ondisk->snap_count is read from disk via rbd_req_sync_read() and thus
needs validation.  Otherwise, a bogus `snap_count' could overflow the
kmalloc() size, leading to memory corruption.

Also use `u32' consistently for `snap_count'.

[elder@dreamhost.com: changed to use UINT_MAX rather than ULONG_MAX]

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
2012-05-14 12:12:41 -05:00
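A minimal user-space sketch of the kind of validation described in the commit above: reject a count read from untrusted storage before it is multiplied into an allocation size. The helper name `checked_alloc_size` and its parameters are hypothetical and are not taken from drivers/block/rbd.c; the actual patch bounds `snap_count` against a limit derived from UINT_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute base + count * elem into *out.
 * Returns 0 on success, -1 if the result would not fit in size_t.
 * All names here are illustrative, not the rbd driver's.
 */
int checked_alloc_size(size_t base, size_t count, size_t elem, size_t *out)
{
	assert(out != NULL);

	/* Dividing the remaining headroom by elem avoids overflowing in the check itself. */
	if (elem != 0 && count > (SIZE_MAX - base) / elem)
		return -1;

	*out = base + count * elem;
	return 0;
}
```

A caller would pass the fixed header size as `base`, the on-disk count as `count`, and the per-snapshot size as `elem`, and refuse to allocate when the helper returns -1.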
Dan Carpenter f8ad495a8a rbd: use gfp_flags parameter in rbd_header_from_disk()
We should use the gfp_flags that the caller specified instead of
GFP_KERNEL here.

There is only one caller and it uses GFP_KERNEL, so this change is
just a cleanup and doesn't change how the code works.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
2012-05-14 12:12:35 -05:00
Jan Beulich 8605067fb9 xen-blkfront: module exit handling adjustments
The blkdev major must be released upon exit, or else the module can't
attach to devices using the same majors upon being loaded again. Also
avoid leaking the minor tracking bitmap.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-11 16:11:54 -04:00
Jan Beulich e77c78c022 xen-blkfront: properly name all devices
- devices beyond xvdzz didn't get proper names assigned at all
- extended devices with minors not representable within the kernel's
  major/minor bit split spilled into foreign majors

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-11 16:11:52 -04:00
Asai Thambi S P a09ba13eef mtip32xx: release the semaphore on an error path
Release the semaphore in an error path in mtip_hw_get_scatterlist(). This
fixes the smatch "inconsistent returns" warning.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-11 16:42:14 +02:00
Jesper Juhl d88a440edd dac960: Remove unused variables from DAC960_CreateProcEntries()
The variables 'StatusProcEntry' and 'UserCommandProcEntry' are
assigned to once and then never used. This patch gets rid of the
variables.

While I was there I also fixed the indentation of the function to use
tabs rather than spaces for the lines that did not already do so.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-05-11 16:42:14 +02:00
Eric W. Biederman 38bf195398 connector/userns: replace netlink uses of cap_raised() with capable()
In 2009 Philipp Reisner noticed that a few users of the netlink connector
interface needed a capability check and added the idiom
cap_raised(nsp->eff_cap, CAP_SYS_ADMIN) to a few of them, on the premise
that netlink was asynchronous.

In 2011 Patrick McHardy noticed we were being silly because netlink is
synchronous and removed eff_cap from the netlink_skb_params and changed
the idiom to cap_raised(current_cap(), CAP_SYS_ADMIN).

Looking at those spots with a fresh eye we should be calling
capable(CAP_SYS_ADMIN).  The only reason I can see for not calling capable
is that it once appeared we were not in the same task as the caller which
would have made calling capable() impossible.

In the initial user_namespace the only difference between
cap_raised(current_cap(), CAP_SYS_ADMIN) and capable(CAP_SYS_ADMIN) is a
few sanity checks and the fact that capable(CAP_SYS_ADMIN) sets
PF_SUPERPRIV if we use the capability.

Since we are going to be using root privilege, setting PF_SUPERPRIV seems
the right thing to do.

The motivation for this patch is that in a child user namespace
cap_raised(current_cap(),...) tests your capabilities with respect to that
child user namespace, not capabilities in the initial user namespace, and
thus will allow processes that should be unprivileged to use the kernel
services that are only protected with cap_raised(current_cap(),..).

To fix possible user_namespace issues and to just clean up the code
replace cap_raised(current_cap(), CAP_SYS_ADMIN) with
capable(CAP_SYS_ADMIN).

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-10 23:21:39 -04:00
Lars Ellenberg 92b4ca291f drbd: grammar fix in log message
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-10 12:00:56 +02:00
Cong Wang bc4854bc91 drbd: check MODULE for THIS_MODULE
THIS_MODULE is NULL only when drbd is compiled as built-in,
so the #ifdef CONFIG_MODULES should be #ifdef MODULE instead.

This fixes the warning:

drivers/block/drbd/drbd_main.c: In function ‘drbd_buildtag’:
drivers/block/drbd/drbd_main.c:4187:24: warning: the comparison will always evaluate as ‘true’ for the address of ‘__this_module’ will never be NULL [-Waddress]

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-10 12:00:54 +02:00
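A small sketch of the preprocessor distinction the commit above relies on. In kernel builds, `CONFIG_MODULES` means the kernel supports loadable modules at all, whereas `MODULE` is defined only while compiling an object that is itself being built as a module. The function name `built_as_module` is hypothetical and exists only for illustration.

```c
/*
 * Returns 1 when this translation unit is compiled as a loadable module
 * (the build passes -DMODULE), 0 when it is compiled as built-in code.
 * Testing CONFIG_MODULES here instead would answer a different question:
 * whether the kernel as a whole supports modules.
 */
int built_as_module(void)
{
#ifdef MODULE
	return 1;
#else
	return 0;
#endif
}
```

Compiled without `-DMODULE`, as for built-in code, the function returns 0; that is the case in which THIS_MODULE is NULL.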
Philipp Reisner f6d0a8dbfd drbd: Restore the request restart logic
It got lost with the commit 5a7bbad27a
"block: remove support for bio remapping from ->make_request"

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 17:20:59 +02:00
Lars Ellenberg 9476f39d66 drbd: introduce a bio_set to allocate housekeeping bios from
Don't rely on availability of bios from the global fs_bio_set,
we should use our own bio_set for meta data IO.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:07 +02:00
Lars Ellenberg 3c2f7a856f drbd: remove unused define
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:06 +02:00
Arne Redlich 0c7db27920 drbd: bm_page_async_io: properly initialize page->private
If bm_page_async_io is advised to use a new page for I/O
(BM_AIO_COPY_PAGES is set), it will get it from a mempool.
Once the mempool has to dip into its reserves the page is
not reinitialized, i.e. page->private contains garbage, which
will lead to various problems once the I/O completes (dereferences
of NULL pointers, the submitting thread getting stuck in D-state,
 ...).

Signed-off-by: Arne Redlich <arne.redlich@googlemail.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2012-05-09 15:17:04 +02:00
Lars Ellenberg 4d95a10f97 drbd: use the newly introduced page pool for bitmap IO
Conflicts:

	drbd/drbd_bitmap.c

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:03 +02:00
Lars Ellenberg 4281808fb3 drbd: add page pool to be used for meta data IO
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:02 +02:00
Lars Ellenberg 0e8488ade2 drbd: allow bitmap to change during writeout from resync_finished
Symptom: messages similar to
 "FIXME asender in bm_change_bits_to,
  bitmap locked for 'write from resync_finished' by worker"

If a resync or verify is finished (or aborted), a full bitmap writeout
is triggered.  If we have ongoing local IO, the bitmap may still change
during that writeout, pending and not yet processed acks may cause bits
to be cleared, while new writes may cause bits to be set.

To fix this, introduce the drbd_bm_write_copy_pages() variant.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:17:00 +02:00
Lars Ellenberg a574daf5d7 drbd: fix race between drbdadm invalidate/verify and finishing resync
When a resync or online verify is finished or aborted,
drbd does a bulk write-out of changed bitmap pages.

If *in that very moment* a new verify or resync is triggered,
this can race:
 ASSERT( !test_bit(BITMAP_IO, &mdev->flags) ) in drbd_main.c
 FIXME going to queue 'set_n_write from StartingSync' but 'write from resync_finished' still pending?
and similar.

This can be observed with e.g. tight invalidate loops in test scripts,
and probably has no real-life implication.

Still, that race can be solved by first quiescing the device
before starting a new resync or verify.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:59 +02:00
Lars Ellenberg ba280c092e drbd: fix resend/resubmit of frozen IO
DRBD can freeze IO, due to fencing policy (fencing resource-and-stonith),
or because we lost access to data (on-no-data-accessible suspend-io).

Resuming from there (re-connect, or re-attach, or explicit admin
intervention) should "just work".

Unfortunately, if the re-attach/re-connect did not happen within
the timeout, since the commit
  drbd: Implemented real timeout checking for request processing time
if so configured, the request_timer_fn() would timeout and
detach/disconnect virtually immediately.

This change tracks the most recent attach and connect, and does not
timeout within <configured timeout interval> after attach/connect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:58 +02:00
Philipp Reisner 5de738272e drbd: Ensure that data_size is not 0 before using data_size-1 as index
This could be exploited by a peer which runs modified code.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:56 +02:00
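A minimal sketch of the guard the commit title above describes, written as plain user-space C. With an unsigned size, `data_size - 1` wraps to a huge value when `data_size` is 0, so the check has to come before the index is formed. The function `last_byte` is a hypothetical example, not DRBD code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the last byte of a peer-supplied buffer, or -1 if it is empty.
 * Without the zero check, data_size - 1 would wrap around to SIZE_MAX
 * and the read would land far outside the buffer.
 */
int last_byte(const unsigned char *buf, size_t data_size)
{
	assert(buf != NULL || data_size == 0);

	if (data_size == 0)
		return -1;

	return buf[data_size - 1];
}
```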
Philipp Reisner 197296ffed drbd: Delay/reject other state changes while establishing a connection
Changes to the role and disk state should be delayed or rejected
while we establish a connection.

This is necessary, since the peer will base its resync decision
on the UUIDs and the state we sent in the drbd_connect() function.

The most prominent example for this race is becoming primary after
sending state and UUIDs and before the state changes to C_WF_CONNECTION.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:55 +02:00
Lars Ellenberg 46385c84ac drbd: move put_ldev from __req_mod() to the endio callback
One invocation in the endio handler is good enough,
we don't need to mention it for each of the different ways
it calls __req_mod().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:51 +02:00
Lars Ellenberg d64957c9a9 drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE
Just because this request happened during a resync does
not mean it may pretend to have been barrier-acked.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:50 +02:00
Lars Ellenberg 41c4a0035b drbd: fix READ_RETRY_REMOTE_CANCELED to not complete if device is suspended
READ_RETRY_REMOTE_CANCELED needs to be grouped with the other _CANCELED
cases, not with CONNECTION_LOST_WHILE_PENDING, as that would complete
(fail) the bio even if the device became suspended.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:48 +02:00
Lars Ellenberg 6d49e101fd drbd: make OOS_HANDED_TO_NETWORK its own case
OOS_HANDED_TO_NETWORK should not be grouped with the various
*_CANCELED/*_FAILED cases.
Also, not only clear the RQ_NET_QUEUED flag, but also mark it RQ_NET_DONE,
so it can be distinguished from a local-only request even after that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:47 +02:00
Lars Ellenberg c088b2d904 drbd: don't pretend that barrier_nr == 0 was special
We used to have a barrier implementation where barrier_nr 0 was
reserved. That is long gone. Just use the full sequence space.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:46 +02:00
Lars Ellenberg 7ffcaa7194 drbd: remove unused static helper function
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:44 +02:00
Lars Ellenberg a5d214f621 drbd: remove some very outdated comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:43 +02:00
Lars Ellenberg 1abc2af205 drbd: missing wakeup after drbd_rs_del_all
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:42 +02:00
Lars Ellenberg 671a74e749 drbd: remove now unused seq_num member from struct drbd_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:40 +02:00
Lars Ellenberg 001a88687a drbd: fix potential data corruption and protocol error
We assumed only bios with bi_idx == 0 would end up
in drbd_make_request().

That is wrong.

At least device mapper, in __clone_and_map(), may submit
clones only covering a partial bio, but sharing
the original bvec, by adjusting bi_idx and relevant
other bio members of the clone.

We used __bio_for_each_segment() in various places,
even though that is documented as
 * drivers should not use the __ version unless they _really_ want to
 * run through the entire bio and not just pending pieces

Impact: we would send the full bio bvec, even for the clone
with bi_idx > 0, which will cause data corruption on the
peer (because we submit wrong data at the clone offset),
and will cause a DRBD protocol error, disconnect/reconnect
and resync (thus fixing the corruption),
because the next packet header would be expected right
in the middle of the sent data, causing DRBD magic mismatch.

Fix: drop the assert, and use bio_for_each_segment()
instead of the __ version.

Conflicts:

	drbd/drbd_tracing.c

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:39 +02:00
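The difference between the two iteration styles can be sketched with a simplified model. The types `toy_seg` and `toy_bio` below are hypothetical stand-ins, not the kernel's `struct bio_vec` and `struct bio`; the only property they model is that a clone may share the original segment array while starting at a non-zero index.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for a segment vector and a bio that may be a partial clone. */
struct toy_seg { size_t len; };
struct toy_bio {
	const struct toy_seg *vec;  /* segment array, possibly shared with the original */
	unsigned short vcnt;        /* number of segments in vec */
	unsigned short idx;         /* first segment this bio still covers */
};

/* Walks every segment from index 0, like the __ variant called with start 0. */
size_t bytes_from_zero(const struct toy_bio *b)
{
	size_t total = 0;
	for (unsigned short i = 0; i < b->vcnt; i++)
		total += b->vec[i].len;
	return total;
}

/* Walks only the pending segments, starting at idx. */
size_t bytes_pending(const struct toy_bio *b)
{
	size_t total = 0;
	assert(b->idx <= b->vcnt);
	for (unsigned short i = b->idx; i < b->vcnt; i++)
		total += b->vec[i].len;
	return total;
}
```

For a clone with `idx` of 1 over segments of 4096, 4096 and 512 bytes, the first function counts 8704 bytes while the second counts 4608; sending the larger amount is the kind of mismatch the commit describes.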
Philipp Reisner b6a370ba07 drbd: Fix a potential write ordering issue on SyncTarget nodes
If a SyncTarget node gets a P_RS_DATA_REPLY before a P_DATA packet
for the same sector, it simply submits these two IO requests.

  This is possible because on the SyncSource node, the data of the
  P_RS_DATA_REPLY packet was read from disk.  Immediately after that a
  write request from upper layers came in.

The disk scheduler or even the "hardware" queues on the disk drive might
reorder these writes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:38 +02:00
Philipp Reisner fc28845bc0 drbd: Fix a potential race that could cause data inconsistency
When we have a write request and a state change C_WF_BITMAP_S -> C_SYNC_SOURCE
at the same time, and it happens that the line

	remote = remote && drbd_should_do_remote(s);

still sees C_WF_BITMAP_S, and

	send_oos = rw == WRITE && drbd_should_send_oos(s);

already sees C_SYNC_SOURCE, both are 0.

This causes the write to not be mirrored, but marked as out-of-sync on the
Sync_Source node.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:34 +02:00
Lars Ellenberg 031a7c173f drbd: add missing part_round_stats to _drbd_start_io_acct
Without this, iostat frequently sees bogus svctime and >= 100% "utilization".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:33 +02:00
Lars Ellenberg 47a4f1c1bb drbd: Fix module refcount leak in drbd_accept()
drbd_accept was modelled after kernel_accept
with drbd commit 53eb779 in July 2008.

Only, kernel_accept was then broken, and only fixed later
with kernel commit 1b08534e in Dec 2008:
net: Fix module refcount leak in kernel_accept()

Impact: protocol families provided as modules, e.g. ipv6 or ib_sdp,
would soon have their reference count become negative, preventing
them from being unloaded (likely), or worse, hit zero without actually
being unused, allowing them to be unloaded while still in use (unlikely,
but if triggered, causing a kernel crash).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:32 +02:00
Philipp Reisner 7caacb69ac drbd: Consider the disk-timeout also for meta-data IO operations
If the backing device is already frozen during attach, we failed
to recognize that. The current disk-timeout code works on top
of the drbd_request objects. During attach we do not allow IO
and therefore never generate a drbd_request object but block
before that in drbd_make_request().

This patch adds the timeout to all drbd_md_sync_page_io().

Before this patch we used to go from D_ATTACHING directly
to D_DISKLESS if IO failed during attach. We can no longer
do this since we have to stay in D_FAILED until all IO
ops issued to the backing device returned.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:30 +02:00
Philipp Reisner 4afc433cf8 drbd: Do not send state packets while lower than C_CONNECTED cstate
I.e. in C_WF_REPORT_PARAMS or in C_WF_CONNECTION.
Sending may already work in these cstates, but the peer still expects
the HandShake / ConnectionFeatures packet.

Actually triggered by the test suite on kugel.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:29 +02:00
Lars Ellenberg 545752d5d8 drbd: fix race between disconnect and receive_state
If the asender thread, or request_timer_fn(), or some other part of
the code, decided to drop the connection (because of timeout or other),
but the receiver just now was processing a P_STATE packet, there was a
chance that receive_state() would do a hard state change
"re-establishing" an already failed connection without additional handshake.

Log excerpt:
  Remote failed to finish a request within ko-count * timeout
  peer( Secondary -> Unknown ) conn( Connected -> Timeout ) pdsk( UpToDate -> DUnknown )
  asender terminated
  ...
  peer( Unknown -> Secondary ) conn( Timeout -> Connected ) pdsk( DUnknown -> UpToDate ) peer_isp( 0 -> 1 )
  ...
  Connection closed
  peer( Secondary -> Unknown ) conn( Connected -> Unconnected ) pdsk( UpToDate -> DUnknown ) peer_isp( 1 -> 0 )
  receiver terminated

Impact:
while the connection state is erroneously "Connected",
requests may be queued and even sent,
which would never be acknowledged,
and may have been missed by the cleanup.
These requests would never be completed.

The next drbd_suspend_io() will then lock up,
waiting forever for these requests to complete.

Fixed in several code paths:
  Make sure the connection state is NetworkFailure or worse
  before starting the cleanup in drbd_disconnect().
  This should make sure the cleanup won't miss any requests.

  Disallow receive_state() to "upgrade" the connection state
  from an error state. This will make sure the "illegal" state
  transition won't happen.

  For all connection failure states,
  relax the safe-guard in sanitize_state() again
  to silently mask out those state changes
  (e.g. Timeout -> Connected becomes Timeout -> Timeout).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:01 +02:00
Lars Ellenberg 763eb63625 drbd: fix potential spinlock deadlock
drbd_try_clear_on_disk_bm() has a sanity check for the number of blocks
left to be resynced (rs_left) in the current resync extent.
If it detects a mismatch, it complains, and forces a disconnect using
drbd_force_state(mdev, NS(conn, C_DISCONNECTING));

Unfortunately, this may be called while holding the req_lock,
and drbd_force_state() wants to acquire that lock itself. Deadlock.

Don't force a disconnect, but fix up rs_left by recounting and
reassigning the number of dirty blocks in that extent.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:58 +02:00
Philipp Reisner e89868a092 drbd: Fixed an obvious copy-n-paste mistake
This bug might have caused trouble if disk-barriers and the ahead-behind
mode are enabled at the same time.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:57 +02:00
Lars Ellenberg f479ea0661 drbd: send intermediate state change results to the peer
DRBD state changes schedule after_state_ch() actions to a worker thread,
which decides, based on the old and new states of that change, whether to
send an informational state update packet (P_STATE) to the peer.
If it decides to drbd_send_state(), it would however always send the
_current_ state, which, if a second state change happens before the
after_state_ch() of the first ran, may "fast-forward" the peer's view
about this node.  In most cases that is harmless, but sometimes this can
confuse DRBD, for example into not actually starting a necessary resync
if you do a very tight detach/attach loop on a Connected Secondary.

Fix this by always sending the "new" state of the respective state
transition which scheduled this after_state_ch() work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:56 +02:00
Lars Ellenberg a2e9138197 drbd: fix spurious meta data IO "error"
When detaching, even cleanly detaching due to administrator request,
we always go through D_FAILED before we become D_DISKLESS.

Don't let that state change race with an in-flight meta data IO,
or that one might think it actually experienced an IO error.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:54 +02:00
Philipp Reisner aaae506d54 drbd: Fixed a race condition between detach and start of resync
drbd_state_lock() is only there to serialize cluster wide state
changes. Testing the local disk state needs to happen while
holding the global_state_lock.

Otherwise you might see something like this (Oct 6 on kugel)
14:20:24 drbd0: conn( WFSyncUUID -> Connected ) disk( Inconsistent -> Failed )
14:20:24 drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
14:20:24 drbd0: conn( Connected -> SyncTarget ) disk( Failed -> Inconsistent )

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:53 +02:00
Lars Ellenberg 6a9a92f4ef drbd: fix harmless race to not trigger an ASSERT
We have one pre-allocated page to do certain synchronous meta data IO with,
using it is serialized like so:
	drbd_md_get_buffer();
	drbd_md_sync_page_io();
	drbd_md_sync_page_io();
	...
	drbd_md_put_buffer();

In drbd_md_sync_page_io() there is an
	ASSERT(atomic_read(&mdev->md_io_in_use) == 1);

We want to be able to timeout on unresponsive lower level devices, so we
can "detach" in that case. Inside drbd_md_sync_page_io() we grab an extra
reference, to not have a dangling pointer in case a delayed IO eventually
does still complete, even after we "detached" already.

We need to put the extra reference before we signal completion from the
completion handler, or the second drbd_md_sync_page_io() above may
trigger the assert (reference count still 2).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:52 +02:00
Philipp Reisner 5ba3dac521 drbd: Derive sync-UUIDs only from the bitmap-uuid if it is non-zero
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:50 +02:00
Andreas Gruenbacher 7b4e4d3126 drbd: drbd_nl_resize(): Fix missing put_ldev() on error path
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:49 +02:00
Lars Ellenberg 40424e4a24 drbd: fix "stalled" empty resync
With sync-after dependencies, given "lucky" timing of pause/unpause
events, the end of an empty (0 bits set) resync was sometimes not
detected on the SyncTarget, leading to a "stalled" SyncSource state.

Fixed this by expecting not only "Inconsistent -> UpToDate" but also
"Consistent -> UpToDate" transitions for the peer disk state
to end a resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:47 +02:00
Philipp Reisner 1e86ac48af drbd: Bugfix for the connection behavior
If we get into the C_BROKEN_PIPE cstate once, the state engine sets the
thi->t_state of the receiver thread to restarting.  But with the while loop
in drbdd_init() a new connection gets established. After that, the call into
drbdd() returns immediately since the thi->t_state is not RUNNING.  The
restart of drbdd_init() then resets thi->t_state to RUNNING.

I.e. after entering C_BROKEN_PIPE once, the next successfully established
connection gets wasted.

The two parts of the fix:
  * Do not cause the thread to restart if we detect the issue
    with the sockets while we are in C_WF_CONNECTION.

  * Make sure that all actions that would have set us to C_BROKEN_PIPE
    happen before the state change to C_WF_REPORT_PARAMS.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:46 +02:00
Philipp Reisner 80f9fd55a6 drbd: Cleanup all epoch objects upon connection loss
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:44 +02:00
Philipp Reisner fd2491f4a4 drbd: detach must not try to abort non-local requests from drbd-8.4
Cherry picked from 8.4

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:43 +02:00
Philipp Reisner 79f16f5dbc drbd: Consider that the no-data-condition could be in connected state
...when the peer has inconsistent data. In that case we failed to
clear the susp_nod flag when the local disk was attached again.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:15:42 +02:00
Philipp Reisner bca482e90b drbd: Fixed current UUID generation
Now, the new edition of the clause only fires if a diskless
peer gets promoted.

This is a fixup for "drbd: Delayed creation of current-UUID".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:50 +02:00
Lars Ellenberg 22f46ce2ef drbd: change some GFP_KERNEL to GFP_NOIO
Bitmap IO may happen in the context of an application write,
in the generic block IO path.  We need to use GFP_NOIO.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:47 +02:00
Philipp Reisner dfa8bedbfe drbd: Implemented the disk-timeout option
When the disk-timeout is active, and it expires for a single request,
we consider the local disk as D_FAILED. Note: With this change,
I made both timeout based state transitions HARD state transitions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:45 +02:00
Philipp Reisner 02ee8f95fa drbd: Force flag for the detach operation
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:38 +02:00
Philipp Reisner 5ca1de0384 drbd: Allow new IOs while the local disk is in FAILED state
The last bunch of commits prepared the 'detach from tar pit' feature.
With that we can be in disk state FAILED for a long time. We need
to accept new IO requests during that time.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:34 +02:00
Philipp Reisner 9e58c4dad7 drbd: Bitmap IO functions can now return prematurely if the disk breaks
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:33 +02:00
Philipp Reisner d1f3779bbe drbd: Added a kref to bm_aio_ctx
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:37:19 +02:00
Philipp Reisner b2057629ea drbd: Hold a reference to ldev while doing meta-data IO
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:31:11 +02:00
Philipp Reisner 4a2fe568b5 drbd: Keep a reference to the bio until the completion handler finished
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:28:51 +02:00
Philipp Reisner 0c46442515 drbd: Implemented wait_until_done_or_disk_failure()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:26:51 +02:00
Philipp Reisner e17117310b drbd: Replaced md_io_mutex by an atomic: md_io_in_use
The new function drbd_md_get_buffer() aborts waiting for the buffer
in case the disk fails in the meantime.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:22:31 +02:00
Philipp Reisner cc94c65015 drbd: moved md_io into mdev
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:17:24 +02:00
Philipp Reisner 2b4dd36fba drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:16:04 +02:00
Philipp Reisner 6d7e32f568 drbd: Keep a reference to barrier acked requests
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:15:28 +02:00
Philipp Reisner 6809384c71 drbd: Improve compatibility with drbd's older than 8.3.7
Regression introduced with 8.3.11 commit:
drbd: Take a more conservative approach when deciding max_bio_size

Never ever tell an older drbd, that we support more than 32KiB
in a single data request (packet).
Never believe an older drbd, that it supports more than 32KiB
in a single data request (packet).
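
The two rules can be sketched as a pair of clamps. This is an illustration
only: the function names and the peer_is_old flag are hypothetical, and how
drbd actually detects an older peer is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* 32 KiB: the most an older peer may be offered or believed to handle. */
#define OLD_PEER_MAX_BIO_SIZE (32u * 1024u)

/* Rule 1: never advertise more than 32 KiB to an older peer. */
static unsigned int size_to_advertise(unsigned int local_max, bool peer_is_old)
{
	assert(local_max > 0);
	if (peer_is_old && local_max > OLD_PEER_MAX_BIO_SIZE)
		return OLD_PEER_MAX_BIO_SIZE;
	return local_max;
}

/* Rule 2: never trust a larger value claimed by an older peer. */
static unsigned int size_to_trust(unsigned int peer_claimed, bool peer_is_old)
{
	if (peer_is_old && peer_claimed > OLD_PEER_MAX_BIO_SIZE)
		return OLD_PEER_MAX_BIO_SIZE;
	return peer_claimed;
}
```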

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:08:57 +02:00
Philipp Reisner 77e8fdfc18 drbd: Only print sanitize state's warnings, if the state change happens
The reason for this change is that running
'drbdadm invalidate' on a disconnected resource caused
an "implicitly set pdsk from UpToDate to DUnknown" message,
which was misleading.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:08:22 +02:00
Lars Ellenberg 07667347c8 drbd: downgraded error printk to info
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:05:25 +02:00
David Howells 5f138ce01a DRBD: Fix comparison always false warning due to long/long long compare
Fix warnings of the following nature in the drbd header:

In file included from drivers/block/drbd/drbd_bitmap.c:32:
drivers/block/drbd/drbd_int.h: In function 'drbd_get_syncer_progress':
drivers/block/drbd/drbd_int.h:2234: warning: comparison is always false due to limited range of data

where mdev->rs_total (an unsigned long) is being compared to 1ULL << 32, which
is always false on a 32-bit machine.
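
A small sketch of one common way to silence this class of warning, namely
compiling the wide comparison only where unsigned long can actually hold
such a value. This is not necessarily what the patch does; the function name
total_needs_64bit() is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static_assert(sizeof(unsigned long long) * CHAR_BIT >= 64,
	      "1ULL << 32 needs a 64-bit type");

/* Does rs_total reach 2^32? Only possible when long is wider than 32 bits. */
static bool total_needs_64bit(unsigned long rs_total)
{
#if ULONG_MAX > 0xFFFFFFFFUL
	return rs_total >= (1ULL << 32);
#else
	(void)rs_total;   /* a 32-bit long can never reach 2^32 */
	return false;
#endif
}
```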

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2012-05-09 10:03:19 +02:00
Lars Ellenberg 7948bcdc38 drbd: spelling fix: too small
It is not "to small", but "too small".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:02:22 +02:00
Lars Ellenberg 1381e9a496 drbd: cosmetic: fix accidental division instead of modulo when pretty printing
For large resync rates, seq_printf_with_thousands_grouping()
accidentally only produced Y,000,00Y, instead of the real numbers.
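
To illustrate why modulo is needed, here is a user-space sketch of
thousands grouping. format_thousands() is a hypothetical helper that writes
into a buffer with snprintf(); it is not the seq_file based function named
above.

```c
#include <assert.h>
#include <stdio.h>

/* Write v with comma-separated groups of three digits; returns the length. */
static int format_thousands(char *buf, size_t len, unsigned long v)
{
	assert(len > 0);
	if (v >= 1000000)
		/*
		 * The middle and last groups are remainders. Dividing v
		 * by 1000000 first and then grouping the quotient would
		 * print 1234567 as "1,000,001".
		 */
		return snprintf(buf, len, "%lu,%03lu,%03lu",
				v / 1000000, (v / 1000) % 1000, v % 1000);
	if (v >= 1000)
		return snprintf(buf, len, "%lu,%03lu", v / 1000, v % 1000);
	return snprintf(buf, len, "%lu", v);
}
```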

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:01:39 +02:00
Philipp Reisner ebd2b0cde5 drbd: Lower log priority for an event that is definitely not an error
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 09:59:29 +02:00
Sage Weil 3469ac1aa3 ceph: drop support for preferred_osd pgs
This was an ill-conceived feature that has been removed from Ceph.  Do
this gracefully:

 - reject attempts to specify a preferred_osd via the ioctl
 - stop exposing this information via virtual xattrs
 - always fill in -1 for requests, in case we talk to an older server
 - don't calculate preferred_osd placements/pgids

Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2012-05-07 15:33:36 -07:00
David S. Miller f24001941c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Fix merge between commit 3adadc08cc ("net ax25: Reorder ax25_exit to
remove races") and commit 0ca7a4c87d ("net ax25: Simplify and
cleanup the ax25 sysctl handling")

The former moved around the sysctl register/unregister calls, the
latter simply removed them.

With help from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-23 23:15:17 -04:00
Pavel Emelyanov 4a17fd5229 sock: Introduce named constants for sk_reuse
Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.
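
A sketch of what such constants could look like. The names SK_NO_REUSE,
SK_CAN_REUSE and SK_FORCE_REUSE are assumed here, and may_share_port() is a
hypothetical, heavily simplified helper that ignores everything else a real
bind conflict check considers.

```c
#include <assert.h>
#include <stdbool.h>

/* 0 and 1 keep their old boolean meaning; 2 is the new forced mode. */
enum {
	SK_NO_REUSE = 0,
	SK_CAN_REUSE = 1,
	SK_FORCE_REUSE = 2,
};

/* May a newcomer share a port that an existing owner already holds? */
static bool may_share_port(int owner_reuse, int newcomer_reuse)
{
	assert(owner_reuse >= SK_NO_REUSE && owner_reuse <= SK_FORCE_REUSE);
	assert(newcomer_reuse >= SK_NO_REUSE && newcomer_reuse <= SK_FORCE_REUSE);

	if (newcomer_reuse == SK_FORCE_REUSE)
		return true;   /* forcibly reuses everyone else's port */
	return owner_reuse != SK_NO_REUSE && newcomer_reuse != SK_NO_REUSE;
}
```

Because "can reuse" stays at 1, existing truth tests on the field keep
working unchanged.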

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-21 15:52:25 -04:00
Linus Torvalds c1acb0ba33 Fixes in various components:
* mechanism to work with misconfigured backends (where they are
    advertised but in reality don't exist).
  * two tiny compile warning fixes.
  * proper error handling in gnttab_resume
  * Not using VM_PFNMAP anymore to allow backends in the same domain.

Merge tag 'stable/for-linus-3.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull xen fixes from Konrad Rzeszutek Wilk:
 - mechanism to work with misconfigured backends (where they are
   advertised but in reality don't exist).
 - two tiny compile warning fixes.
 - proper error handling in gnttab_resume
 - Not using VM_PFNMAP anymore to allow backends in the same domain.

* tag 'stable/for-linus-3.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  Revert "xen/p2m: m2p_find_override: use list_for_each_entry_safe"
  xen/resume: Fix compile warnings.
  xen/xenbus: Add quirk to deal with misconfigured backends.
  xen/blkback: Fix warning error.
  xen/p2m: m2p_find_override: use list_for_each_entry_safe
  xen/gntdev: do not set VM_PFNMAP
  xen/grant-table: add error-handling code on failure of gnttab_resume
2012-04-20 11:31:00 -07:00
Konrad Rzeszutek Wilk a71e23d992 xen/blkback: Fix warning error.
drivers/block/xen-blkback/xenbus.c: In function 'xen_blkbk_discard':
drivers/block/xen-blkback/xenbus.c:419:4: warning: passing argument 1 of 'dev_warn' makes pointer from integer without a cast [enabled by default]
include/linux/device.h:894:5: note: expected 'const struct device *' but argument is of type 'long int'

It is unclear how that mistake made it in. It surely is wrong.

Acked-by: Jens Axboe <axboe@kernel.dk>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-18 15:54:08 -04:00
Linus Torvalds cdd5983063 virtio: fixes on top of 3.4-rc2
Here are some virtio fixes for 3.4:
 a test build fix, a patch by Ren fixing naming for systems with a massive
 number of virtio blk devices, and balloon fixes for powerpc
 by David Gibson.
 
 There was some discussion about Ren's patch for virtio disc naming: some people
 wanted to move the legacy name mangling function to the block core.  But
 there's no consensus on that yet, and we can always deduplicate later.
 Added comments in the hope that this will stop people from
 copying this legacy naming scheme into future drivers.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael S. Tsirkin:
 "Here are some virtio fixes for 3.4: a test build fix, a patch by Ren
  fixing naming for systems with a massive number of virtio blk devices,
  and balloon fixes for powerpc by David Gibson.

  There was some discussion about Ren's patch for virtio disc naming:
  some people wanted to move the legacy name mangling function to the
  block core.  But there's no consensus on that yet, and we can always
  deduplicate later.  Added comments in the hope that this will stop
  people from copying this legacy naming scheme into future drivers."

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_balloon: fix handling of PAGE_SIZE != 4k
  virtio_balloon: Fix endian bug
  virtio_blk: helper function to format disk names
  tools/virtio: fix up vhost/test module build
2012-04-16 18:34:12 -07:00
Linus Torvalds c104f1fa1e Merge branch 'for-3.4/drivers' of git://git.kernel.dk/linux-block
Pull block driver bits from Jens Axboe:

 - A series of fixes for mtip32xx.  Most from Asai at Micron, but also
   one from Greg, getting rid of the dependency on PCIE_HOTPLUG.

 - A few bug fixes for xen-blkfront, and blkback.

 - A virtio-blk fix from Vivek, making resize actually work.

 - Two fixes from Stephen, making larger transfers possible on cciss.
   This is needed for tape drive support.

* 'for-3.4/drivers' of git://git.kernel.dk/linux-block:
  block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancy
  mtip32xx: dump tagmap on failure
  mtip32xx: fix handling of commands in various scenarios
  mtip32xx: Shorten macro names
  mtip32xx: misc changes
  mtip32xx: Add new sysfs entry 'status'
  mtip32xx: make setting comp_time as common
  mtip32xx: Add new bitwise flag 'dd_flag'
  mtip32xx: fix error handling in mtip_init()
  virtio-blk: Call revalidate_disk() upon online disk resize
  xen/blkback: Make optional features be really optional.
  xen/blkback: Squash the discard support for 'file' and 'phy' type.
  mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize and start port in mtip_restart_port()
  cciss: Fix scsi tape io with more than 255 scatter gather elements
  cciss: Initialize scsi host max_sectors for tape drive support
  xen-blkfront: make blkif_io_lock spinlock per-device
  xen/blkfront: don't put bdev right after getting it
  xen-blkfront: use bitmap_set() and bitmap_clear()
  xen/blkback: Enable blkback on HVM guests
  xen/blkback: use grant-table.c hypercall wrappers
2012-04-13 18:45:13 -07:00
Ren Mingxin c0aa3e0916 virtio_blk: helper function to format disk names
The current virtio block's naming algorithm just supports 18278
(26^3 + 26^2 + 26) disks. If there are more virtio blocks,
there will be disks with the same name.

Based on commit 3e1a7ff8a0, add
a function "virtblk_name_format()" for virtio block to support naming
a large number of disks.

Notes:
- Our naming scheme is ugly. We are stuck with it
  for virtio but don't use it for any new driver:
  new drivers should name their devices PREFIX%d
  where the sequence number can be allocated by ida
- sd_format_disk_name has exactly the same logic.
  Moving it to a central place was deferred over worries
  that this will make people keep using the legacy naming
  in new drivers.
  We kept the code identical in case someone wants to deduplicate later.
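
The letter-suffix scheme can be sketched in user space as follows. The
helper name disk_name_format() is hypothetical; it follows the naming logic
described above (a, ..., z, aa, ..., zz, aaa, ...) and is not a copy of the
kernel function.

```c
#include <assert.h>
#include <string.h>

/* Build prefix + letter suffix for index. Returns 0, or -1 if buf is too small. */
static int disk_name_format(const char *prefix, int index, char *buf, int buflen)
{
	const int base = 'z' - 'a' + 1;     /* 26 letters */
	char *begin = buf + strlen(prefix);
	char *end = buf + buflen;
	char *p = end - 1;

	assert(index >= 0 && buflen > (int)strlen(prefix));
	*p = '\0';
	do {
		if (p == begin)
			return -1;          /* no room for another letter */
		*--p = 'a' + (index % base);
		index = index / base - 1;   /* so that "z" is followed by "aa" */
	} while (index >= 0);

	memmove(begin, p, end - p);         /* slide the suffix next to the prefix */
	memcpy(buf, prefix, strlen(prefix));
	return 0;
}
```

With this scheme, indexes 0 to 18277 produce suffixes of one to three
letters, which matches the 26^3 + 26^2 + 26 limit quoted above; index 18278
is the first to need a fourth letter.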

Signed-off-by: Ren Mingxin <renmx@cn.fujitsu.com>
Acked-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-04-12 10:37:05 +03:00
Greg Kroah-Hartman 6363480651 block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancy
This removes the HOTPLUG_PCI_PCIE dependency on the driver and makes it
depend on PCI.

Cc: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-12 08:47:05 +02:00
Asai Thambi S P 95fea2f1d9 mtip32xx: dump tagmap on failure
Dump tagmap on failure, instead of individual tags.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-09 08:35:39 +02:00
Asai Thambi S P c74b0f586f mtip32xx: fix handling of commands in various scenarios
* If an ncq command times out and a non-ncq command is active, skip restarting the port
* Queue (pause) ncq commands during operations spanning more than one non-ncq command - secure erase, download microcode
* When a non-ncq command is active, allow incoming non-ncq commands to wait instead of failing back
* Changed timeout for download microcode and smart commands
* If the device is in write protect mode, fail all writes (do not send to device)
* Set maximum retries to 2

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-09 08:35:39 +02:00
Asai Thambi S P 8a857a880b mtip32xx: Shorten macro names
Shortened macros used to represent mtip_port->flags and dd->dd_flag

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-09 08:35:38 +02:00
Asai Thambi S P 8182b49528 mtip32xx: misc changes
* Handle the interrupt completion of polled internal commands
* Do not check remove pending flag for standby command
* On rebuild failure,
    - set corresponding bit dd_flag
    - do not send standby command
* Free ida index in remove path

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-09 08:35:38 +02:00
Asai Thambi S P f65872177d mtip32xx: Add new sysfs entry 'status'
* Add support for detecting the following device status
        - write protect
        - over temp (thermal shutdown)
* Add new sysfs entry 'status', possible values - online, write_protect, thermal_shutdown
* Add new file 'sysfs-block-rssd' to document ABI (Reported-by: Greg Kroah-Hartman)

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-09 08:35:38 +02:00
Asai Thambi S P dad40f16ff mtip32xx: make setting comp_time as common
Moved setting completion time into mtip_issue_ncq_command()

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-09 08:35:38 +02:00
Asai Thambi S P 45038367c2 mtip32xx: Add new bitwise flag 'dd_flag'
* Merged the following flags into one variable 'dd_flag':
        * drv_cleanup_done
        * resumeflag
* Added the following flags into 'dd_flag'
        * remove pending
        * init done
* Removed 'ftlrebuildflag' (similar flag is already part of mtip_port->flags)
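
A sketch of folding several boolean fields into one flag word. The bit
names and helpers below are hypothetical, and the plain bit operations are
not atomic; kernel code would typically use set_bit(), clear_bit() and
test_bit() instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions inside the single dd_flag word. */
enum dd_flag_bit {
	DD_BIT_CLEANUP_DONE,
	DD_BIT_RESUME,
	DD_BIT_REMOVE_PENDING,
	DD_BIT_INIT_DONE,
	DD_BIT_COUNT
};

static void dd_set(unsigned long *dd_flag, enum dd_flag_bit bit)
{
	assert(bit < DD_BIT_COUNT);
	*dd_flag |= 1UL << bit;
}

static void dd_clear(unsigned long *dd_flag, enum dd_flag_bit bit)
{
	assert(bit < DD_BIT_COUNT);
	*dd_flag &= ~(1UL << bit);
}

static bool dd_test(const unsigned long *dd_flag, enum dd_flag_bit bit)
{
	assert(bit < DD_BIT_COUNT);
	return (*dd_flag >> bit) & 1UL;
}
```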

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-09 08:35:38 +02:00
Linus Torvalds 9479f0f801 Two fixes for regressions:
* one is a workaround that will be removed in v3.5 with proper fix in the tip/x86 tree,
  * the other is to fix drivers to load on PV (a previous patch made them only
    load in PVonHVM mode).
 
 The rest are just minor fixes in the various drivers and some cleanup in the
 core code.

Merge tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull xen fixes from Konrad Rzeszutek Wilk:
 "Two fixes for regressions:
   * one is a workaround that will be removed in v3.5 with proper fix in
     the tip/x86 tree,
   * the other is to fix drivers to load on PV (a previous patch made
     them only load in PVonHVM mode).

  The rest are just minor fixes in the various drivers and some cleanup
  in the core code."

* tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
  xen/pciback: fix XEN_PCI_OP_enable_msix result
  xen/smp: Remove unnecessary call to smp_processor_id()
  xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'
  xen: only check xen_platform_pci_unplug if hvm
2012-04-06 17:54:53 -07:00
Igor Mammedov e95ae5a493 xen: only check xen_platform_pci_unplug if hvm
commit b9136d207f08
  xen: initialize platform-pci even if xen_emul_unplug=never

breaks blkfront/netfront by not loading them because of
xen_platform_pci_unplug=0 and it is never set for PV guest.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-06 12:12:52 -04:00
Alex Elder cd9d9f5df6 rbd: don't hold spinlock during messenger flush
A recent change protected changes to the rbd_client_list with
a spinlock.  Unfortunately in rbd_put_client(), the lock is taken
before possibly dropping the last reference to an rbd_client, and on
the last reference that eventually calls flush_workqueue() which can
sleep.

The problem was flagged by a debug spinlock warning:
    BUG: spinlock wrong CPU on CPU#3, rbd/27814

The solution is to move the spinlock acquisition and release inside
rbd_client_release(), which is the spot where it's really needed for
protecting the removal of the rbd_client from the client list.
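
The shape of the fix can be modelled in user space. Everything below is a
hypothetical stand-in: the "lock" is a flag checked by assertions, the
reference count is a plain int rather than a kref, and flush_may_sleep()
only represents a call such as flush_workqueue().

```c
#include <assert.h>
#include <stdbool.h>

static bool list_lock_held;   /* models the client-list spinlock */

static void list_lock(void)
{
	assert(!list_lock_held);
	list_lock_held = true;
}

static void list_unlock(void)
{
	assert(list_lock_held);
	list_lock_held = false;
}

/* Stands in for a sleeping call: must never run under the spinlock. */
static void flush_may_sleep(void)
{
	assert(!list_lock_held);
}

struct client {
	int refcount;
	bool on_list;
};

/* Runs once the last reference is gone. */
static void client_release(struct client *c)
{
	list_lock();          /* the lock covers only the list removal */
	c->on_list = false;
	list_unlock();

	flush_may_sleep();    /* safe: the lock was dropped above */
}

/* The caller no longer takes the lock around the final put. */
static void put_client(struct client *c)
{
	if (--c->refcount == 0)
		client_release(c);
}
```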

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-04-05 15:43:58 -05:00
Ryosuke Saito 6d27f09a63 mtip32xx: fix error handling in mtip_init()
Ensure that block device is properly unregistered, if
pci_register_driver() fails.
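
The unwind pattern, sketched with user-space stubs. The stub_* functions,
example_init() and the pci_should_fail knob are hypothetical placeholders
for the real registration calls and do not reflect their signatures.

```c
#include <assert.h>
#include <stdbool.h>

static bool blkdev_registered;   /* tracks the stubbed block-device registration */
static bool pci_should_fail;     /* knob: make the second step fail */

static int stub_register_blkdev(void)
{
	blkdev_registered = true;
	return 250;              /* arbitrary major number for the sketch */
}

static void stub_unregister_blkdev(void)
{
	assert(blkdev_registered);
	blkdev_registered = false;
}

static int stub_pci_register_driver(void)
{
	return pci_should_fail ? -1 : 0;
}

static int example_init(void)
{
	int major, err;

	major = stub_register_blkdev();
	if (major < 0)
		return major;

	err = stub_pci_register_driver();
	if (err) {
		stub_unregister_blkdev();   /* undo step one before bailing out */
		return err;
	}
	return 0;
}
```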

Signed-off-by: Ryosuke Saito <raitosyo@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-04-05 08:09:34 -06:00
Len Brown f6365201d8 x86: Remove the ancient and deprecated disable_hlt() and enable_hlt() facility
The X86_32-only disable_hlt/enable_hlt mechanism was used by the
32-bit floppy driver. Its effect was to replace the use of the
HLT instruction inside default_idle() with cpu_relax() - essentially
it turned off the use of HLT.

This workaround was commented in the code as:

 "disable hlt during certain critical i/o operations"

 "This halt magic was a workaround for ancient floppy DMA
  wreckage. It should be safe to remove."

H. Peter Anvin additionally adds:

 "To the best of my knowledge, no-hlt only existed because of
  flaky power distributions on 386/486 systems which were sold to
  run DOS.  Since DOS did no power management of any kind,
  including HLT, the power draw was fairly uniform; when exposed
  to the much higher noise levels you got when Linux used HLT
  caused some of these systems to fail.

  They were by far in the minority even back then."

Alan Cox further says:

 "Also for the Cyrix 5510 which tended to go castors up if a HLT
  occurred during a DMA cycle and on a few other boxes HLT during
  DMA tended to go astray.

  Do we care ? I doubt it. The 5510 was pretty obscure, the 5520
  fixed it, the 5530 is probably the oldest still in any kind of
  use."

So, let's finally drop this.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/n/tip-3rhk9bzf0x9rljkv488tloib@git.kernel.org
[ If anyone cares then alternative instruction patching could be
  used to replace HLT with a one-byte NOP instruction. Much simpler. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-30 08:50:27 +02:00
Vivek Goyal e9986f303d virtio-blk: Call revalidate_disk() upon online disk resize
If a virtio disk is open in guest and a disk resize operation is done,
(virsh blockresize), new size is not visible to tools like "fdisk -l".
This seems to be happening as we update only part->nr_sects and not
bdev->bd_inode size.

Call revalidate_disk() which should take care of it. I tested growing disk
size of already open disk and it works for me.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-29 10:09:44 +02:00
Linus Torvalds 532bfc851a Merge branch 'akpm' (Andrew's patch-bomb)
Merge third batch of patches from Andrew Morton:
 - Some MM stragglers
 - core SMP library cleanups (on_each_cpu_mask)
 - Some IPI optimisations
 - kexec
 - kdump
 - IPMI
 - the radix-tree iterator work
 - various other misc bits.

 "That'll do for -rc1.  I still have ~10 patches for 3.4, will send
  those along when they've baked a little more."

* emailed from Andrew Morton <akpm@linux-foundation.org>: (35 commits)
  backlight: fix typo in tosa_lcd.c
  crc32: add help text for the algorithm select option
  mm: move hugepage test examples to tools/testing/selftests/vm
  mm: move slabinfo.c to tools/vm
  mm: move page-types.c from Documentation to tools/vm
  selftests/Makefile: make `run_tests' depend on `all'
  selftests: launch individual selftests from the main Makefile
  radix-tree: use iterators in find_get_pages* functions
  radix-tree: rewrite gang lookup using iterator
  radix-tree: introduce bit-optimized iterator
  fs/proc/namespaces.c: prevent crash when ns_entries[] is empty
  nbd: rename the nbd_device variable from lo to nbd
  pidns: add reboot_pid_ns() to handle the reboot syscall
  sysctl: use bitmap library functions
  ipmi: use locks on watchdog timeout set on reboot
  ipmi: simplify locking
  ipmi: fix message handling during panics
  ipmi: use a tasklet for handling received messages
  ipmi: increase KCS timeouts
  ipmi: decrease the IPMI message transaction time in interrupt mode
  ...
2012-03-28 17:19:28 -07:00
Wanlong Gao f4507164e7 nbd: rename the nbd_device variable from lo to nbd
rename the nbd_device variable from "lo" to "nbd", since "lo" is just a name
copied from loop.c.

Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28 17:14:37 -07:00
Linus Torvalds 0195c00244 Disintegrate and delete asm/system.h

Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system

Pull "Disintegrate and delete asm/system.h" from David Howells:
 "Here are a bunch of patches to disintegrate asm/system.h into a set of
  separate bits to relieve the problem of circular inclusion
  dependencies.

  I've built all the working defconfigs from all the arches that I can
  and made sure that they don't break.

  The reason for these patches is that I recently encountered a circular
  dependency problem that came about when I produced some patches to
  optimise get_order() by rewriting it to use ilog2().

  This uses bitops - and on the SH arch asm/bitops.h drags in
  asm-generic/get_order.h by a circuitous route involving asm/system.h.

  The main difficulty seems to be asm/system.h.  It holds a number of
  low level bits with no/few dependencies that are commonly used (eg.
  memory barriers) and a number of bits with more dependencies that
  aren't used in many places (eg.  switch_to()).

  These patches break asm/system.h up into the following core pieces:

    (1) asm/barrier.h

        Move memory barriers here.  This is already done for MIPS and Alpha.

    (2) asm/switch_to.h

        Move switch_to() and related stuff here.

    (3) asm/exec.h

        Move arch_align_stack() here.  Other process execution related bits
        could perhaps go here from asm/processor.h.

    (4) asm/cmpxchg.h

        Move xchg() and cmpxchg() here as they're full word atomic ops and
        frequently used by atomic_xchg() and atomic_cmpxchg().

    (5) asm/bug.h

        Move die() and related bits.

    (6) asm/auxvec.h

        Move AT_VECTOR_SIZE_ARCH here.

  Other arch headers are created as needed on a per-arch basis."

Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that.  We'll find out anything that got broken and fix it.

* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
  Delete all instances of asm/system.h
  Remove all #inclusions of asm/system.h
  Add #includes needed to permit the removal of asm/system.h
  Move all declarations of free_initmem() to linux/mm.h
  Disintegrate asm/system.h for OpenRISC
  Split arch_align_stack() out from asm-generic/system.h
  Split the switch_to() wrapper out of asm-generic/system.h
  Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
  Create asm-generic/barrier.h
  Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
  Disintegrate asm/system.h for Xtensa
  Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
  Disintegrate asm/system.h for Tile
  Disintegrate asm/system.h for Sparc
  Disintegrate asm/system.h for SH
  Disintegrate asm/system.h for Score
  Disintegrate asm/system.h for S390
  Disintegrate asm/system.h for PowerPC
  Disintegrate asm/system.h for PA-RISC
  Disintegrate asm/system.h for MN10300
  ...
2012-03-28 15:58:21 -07:00
Linus Torvalds 47b816ff7d Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull a few more things for powerpc by Benjamin Herrenschmidt:
 - Anton did some recent improvements to EPOW event reporting on
   pSeries (power supply failures and such).  The patches are self
   contained enough and replace really nasty code so I felt it should
   still go in
 - I did the vio driver registration change Greg requested, I don't see
   the point of leaving that til the next merge window
 - The remaining EEH changes I said were still pending to get rid of the
   EEH references from the generic struct device_node
 - A few more iSeries removal bits
 - A perf bug fix on 970

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/perf: Fix instruction address sampling on 970 and Power4
  powerpc+sparc/vio: Modernize driver registration
  powerpc: Random little legacy iSeries removal tidy ups
  powerpc: Remove NO_IRQ_IGNORE
  powerpc/pseries: Cut down on enthusiastic use of defines in RAS code
  powerpc/pseries: Clean up ras_error_interrupt code
  powerpc/pseries: Remove RTAS_POWERMGM_EVENTS
  powerpc/pseries: Use rtas_get_sensor in RAS code
  powerpc/pseries: Parse and handle EPOW interrupts
  powerpc: Make function that parses RTAS error logs global
  powerpc/eeh: Retrieve PHB from global list
  powerpc/eeh: Remove eeh information from pci_dn
  powerpc/eeh: Remove eeh device from OF node
2012-03-28 14:41:36 -07:00
David Howells 9ffc93f203 Remove all #inclusions of asm/system.h
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28 18:30:03 +01:00
Linus Torvalds 56b59b429b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates for 3.4-rc1 from Sage Weil:
 "Alex has been busy.  There are a range of rbd and libceph cleanups,
  especially surrounding device setup and teardown, and a few critical
  fixes in that code.  There are more cleanups in the messenger code,
  virtual xattrs, a fix for CRC calculation/checks, and lots of other
  miscellaneous stuff.

  There's a patch from Amon Ott to make inos behave a bit better on
  32-bit boxes, some decode check fixes from Xi Wang, a network
  throttling fix from Jim Schutt, and a couple RBD fixes from Josh
  Durgin.

  No new functionality, just a lot of cleanup and bug fixing."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (65 commits)
  rbd: move snap_rwsem to the device, rename to header_rwsem
  ceph: fix three bugs, two in ceph_vxattrcb_file_layout()
  libceph: isolate kmap() call in write_partial_msg_pages()
  libceph: rename "page_shift" variable to something sensible
  libceph: get rid of zero_page_address
  libceph: only call kernel_sendpage() via helper
  libceph: use kernel_sendpage() for sending zeroes
  libceph: fix inverted crc option logic
  libceph: some simple changes
  libceph: small refactor in write_partial_kvec()
  libceph: do crc calculations outside loop
  libceph: separate CRC calculation from byte swapping
  libceph: use "do" in CRC-related Boolean variables
  ceph: ensure Boolean options support both senses
  libceph: a few small changes
  libceph: make ceph_tcp_connect() return int
  libceph: encapsulate some messenger cleanup code
  libceph: make ceph_msgr_wq private
  libceph: encapsulate connection kvec operations
  libceph: move prepare_write_banner()
  ...
2012-03-28 10:01:29 -07:00
Benjamin Herrenschmidt cb52d8970e powerpc+sparc/vio: Modernize driver registration
This makes vio_register_driver() get the module owner & name at compile
time like PCI drivers do, and adds a name pointer directly in struct
vio_driver to avoid having to explicitly initialize the embedded
struct device.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David S. Miller <davem@davemloft.net>
2012-03-28 11:33:24 +11:00
Jens Axboe 6674fb79ca Merge branch 'stable/for-jens-3.4-bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.4/drivers
Konrad writes:

I've two small fixes for the xen-blkback - and I think one more will show up
eventually (a partial revert), but not sure when. So in the spirit of keeping
the patches flowing, please git pull the following branch.
2012-03-26 09:13:14 +02:00
Linus Torvalds e22057c859 One tiny feature that accidentally got lost in the initial git pull:
* Add fast-EOI acking of interrupts (clear a bit instead of hypercall)
 And bug-fixes:
  * Fix CPU bring-up code missing a call to notify other subsystems.
  * Fix reading /sys/hypervisor even if PVonHVM drivers are not loaded.
  * In Xen ACPI processor driver: remove too verbose WARN messages, fix up
    the Kconfig dependency to be a module by default, and add dependency on
    CPU_FREQ.
  * Disable CPU frequency drivers from loading when booting under Xen
    (as we want the Xen ACPI processor to be used instead).
  * Cleanups in tmem code.

Merge tag 'stable/for-linus-3.4-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull more xen updates from Konrad Rzeszutek Wilk:
 "One tiny feature that accidentally got lost in the initial git pull:
   * Add fast-EOI acking of interrupts (clear a bit instead of
     hypercall)
  And bug-fixes:
   * Fix CPU bring-up code missing a call to notify other subsystems.
   * Fix reading /sys/hypervisor even if PVonHVM drivers are not loaded.
   * In Xen ACPI processor driver: remove too verbose WARN messages, fix
     up the Kconfig dependency to be a module by default, and add
     dependency on CPU_FREQ.
   * Disable CPU frequency drivers from loading when booting under Xen
     (as we want the Xen ACPI processor to be used instead).
   * Cleanups in tmem code."

* tag 'stable/for-linus-3.4-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/acpi: Fix Kconfig dependency on CPU_FREQ
  xen: initialize platform-pci even if xen_emul_unplug=never
  xen/smp: Fix bringup bug in AP code.
  xen/acpi: Remove the WARN's as they just create noise.
  xen/tmem: cleanup
  xen: support pirq_eoi_map
  xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
  xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
  provide disable_cpufreq() function to disable the API.
2012-03-24 12:20:25 -07:00
Konrad Rzeszutek Wilk 3389bb8bf7 xen/blkback: Make optional features be really optional.
They were using the xenbus_dev_fatal() function, which would
change the state of the connection immediately. That is not
what we want when we advertise optional features.

So make 'feature-discard','feature-barrier','feature-flush-cache'
optional.

Suggested-by: Jan Beulich <JBeulich@suse.com>
[v1: Made the discard function void and static]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-24 10:04:36 -04:00
Konrad Rzeszutek Wilk 4dae76705f xen/blkback: Squash the discard support for 'file' and 'phy' type.
The only reason for the distinction was the special case of
'file' (which is assumed to be a loopback device): to reach inside
the loopback device, find the underlying file, and call fallocate on it.
Fortunately "xen-blkback: convert hole punching to discard request on
loop devices" removes that use-case, and we now base the discard
support on blk_queue_discard(q) and extract all appropriate
parameters from the 'struct request_queue'.

CC: Li Dongyang <lidongyang@novell.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
[v1: Dropping pointless initializer and keeping blank line]
[v2: Remove the kfree as it is not used anymore]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-24 10:04:35 -04:00
Oleg Nesterov 70834d3070 usermodehelper: use UMH_WAIT_PROC consistently
A few call_usermodehelper() callers use the hardcoded constant instead of
the proper UMH_WAIT_PROC, fix them.

Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-23 16:58:41 -07:00
Asai Thambi S P 22be2e6e13 mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize and start port in mtip_restart_port()
This patch includes two changes:
	* fix incorrect value set for drv_cleanup_done
	* re-initialize and start port in mtip_restart_port()

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-23 12:33:03 +01:00
Stephen M. Cameron bc67f63650 cciss: Fix scsi tape io with more than 255 scatter gather elements
The total number of scatter gather elements in the CISS command
used by the scsi tape code was being cast to a u8, which can hold
at most 255 scatter gather elements.  It should have been cast to
a u16.  Without this patch the command gets rejected by the controller
since the total scatter gather count did not add up to the right
value resulting in an i/o error.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-22 21:40:09 +01:00
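The truncation described in the cciss commit above can be illustrated with a small user-space sketch. The function names are hypothetical stand-ins for the old and new casts and are not taken from the cciss driver; the point is only that a count above 255 does not survive a cast to an 8-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the old behaviour: the total scatter
 * gather count is narrowed to 8 bits, so only the low byte survives. */
unsigned int sg_total_as_u8(unsigned int sg_count)
{
    return (uint8_t)sg_count;
}

/* Hypothetical stand-in for the fixed behaviour: a 16-bit field
 * represents any count up to 65535 without loss. */
unsigned int sg_total_as_u16(unsigned int sg_count)
{
    return (uint16_t)sg_count;
}
```

With 300 elements the 8-bit variant yields 44 (300 modulo 256), which no longer matches the number of elements actually attached to the command; the 16-bit variant yields 300.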
Stephen M. Cameron 395d287526 cciss: Initialize scsi host max_sectors for tape drive support
The default is too small (1024 blocks); use h->cciss_max_sectors (8192 blocks).
Without this change, if you try to set the block size of a tape drive above
512*1024, via "mt -f /dev/st0 setblk nnn" where nnn is greater than 524288,
it won't work right.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-22 21:40:08 +01:00
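The numbers in the max_sectors commit above can be checked with a short calculation. This sketch assumes that "blocks" means 512-byte sectors, which is consistent with the 512*1024 limit quoted in the message; the macro and function names are illustrative only.

```c
#include <assert.h>

/* Assumption: one "block" in the commit message is a 512-byte sector. */
#define DEMO_SECTOR_SIZE 512UL

/* Largest single transfer, in bytes, for a given max_sectors limit. */
unsigned long demo_max_transfer_bytes(unsigned long max_sectors)
{
    return max_sectors * DEMO_SECTOR_SIZE;
}
```

Under that assumption the default of 1024 sectors caps a transfer at 524288 bytes (512*1024), while 8192 sectors allows 4194304 bytes.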
Josh Durgin c666601a93 rbd: move snap_rwsem to the device, rename to header_rwsem
A new temporary header is allocated each time the header changes, but
only the changed properties are copied over. We don't need a new
semaphore for each header update.

This addresses http://tracker.newdream.net/issues/2174

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:52 -05:00
Alex Elder 32eec68d2f rbd: don't drop the rbd_id too early
Currently an rbd device's id is released when it is removed, but it
is done before the code is run to clean up sysfs-related files (such
as /sys/bus/rbd/devices/1).

It's possible that an rbd is still in use after the rbd_remove()
call has been made.  It's essentially the same as an active inode
that stays around after it has been removed--until its final close
operation.  This means that the id shows up as free for reuse at a
time it should not be.

The effect of this was seen by Jens Rehpoehler, who:
    - had a filesystem mounted on an rbd device
    - unmapped that filesystem (without unmounting)
    - found that the mount still worked properly
    - but hit a panic when he attempted to re-map a new rbd device

This re-map attempt found the previously-unmapped id available.
The subsequent attempt to reuse it was met with a panic while
attempting to (re-)install the sysfs entry for the new mapped
device.

Fix this by holding off "putting" the rbd id, until the rbd_device
release function is called--when the last reference is finally
dropped.

Note: This fixes: http://tracker.newdream.net/issues/1907

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:50 -05:00
Alex Elder 593a9e7b34 rbd: small changes
Here is another set of small code tidy-ups:
    - Define SECTOR_SHIFT and SECTOR_SIZE, and use these symbolic
      names throughout.  Tell the blk_queue system our physical
      block size, in the (unlikely) event we want to use something
      other than the default.
    - Delete the definition of struct rbd_info, which is never used.
    - Move the definition of dev_to_rbd() down in its source file,
      just above where it gets first used, and change its name to
      dev_to_rbd_dev().
    - Replace an open-coded operation in rbd_dev_release() to use
      dev_to_rbd_dev() instead.
    - Calculate the segment size for a given rbd_device just once in
      rbd_init_disk().
    - Use the '%zd' conversion specifier in rbd_snap_size_show(),
      since the value formatted is a size_t.
    - Switch to the '%llu' conversion specifier in rbd_snap_id_show(),
      since the value formatted is unsigned.

Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:50 -05:00
Alex Elder 00f1f36ffa rbd: do some refactoring
A few blocks of code are rearranged a bit here:
    - In rbd_header_from_disk():
	- Don't bother computing snap_count until we're sure the
	  on-disk header starts with a good signature.
	- Move a few independent lines of code so they are *after* a
	  check for a failed memory allocation.
	- Get rid of unnecessary local variable "ret".
    - Make a few other changes in rbd_read_header(), similar to the
      above--just moving things around a bit while preserving the
      functionality.
    - In rbd_rq_fn(), just assign rq in the while loop's controlling
      expression rather than duplicating it before and at the end of
      the loop body.  This allows the use of "continue" rather than
      "goto next" in a number of spots.
    - Rearrange the logic in snap_by_name().  End result is the same.

Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:50 -05:00
Alex Elder fed4c143ba rbd: fix module sysfs setup/teardown code
Once rbd_bus_type is registered, it allows an "add" operation via
the /sys/bus/rbd/add bus attribute, and adding a new rbd device that
way establishes a connection between the device and rbd_root_dev.
But rbd_root_dev is not registered until after the rbd_bus_type
registration is complete.  This could (in principle anyway) result
in an invalid state.

Since rbd_root_dev has no tie to rbd_bus_type we can reorder these
two initializations and never be faced with this scenario.

In addition, unregister the device in the event the bus registration
fails at module init time.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:50 -05:00
Alex Elder 7ef3214af2 rbd: don't allocate mon_addrs buffer in rbd_add()
The mon_addrs buffer in rbd_add is used to hold a copy of the
monitor IP addresses supplied via /sys/bus/rbd/add.  That is
passed to rbd_get_client(), which never modifies it (nor do
any of the functions it gets passed to thereafter)--the mon_addr
parameter to rbd_get_client() is a pointer to constant data, so it
can't be modified.  Furthermore, rbd_get_client() has the length of
the mon_addrs buffer and that is used to ensure nothing goes beyond
its end.

Based on all this, there is no reason that a buffer needs to
be used to hold a copy of the mon_addrs provided via
/sys/bus/rbd/add.   Instead, the location within that passed-in
buffer can be provided, along with the length of the "token"
therein which represents the monitor IP's.

A small change to rbd_add_parse_args() allows the address within the
buffer to be passed back, and the length is already returned.  This
now means that, at least from the perspective of this interface,
there is no such thing as a list of monitor addresses that is too
long.

Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:50 -05:00
Alex Elder 5214ecc45c rbd: have rbd_parse_args() report found mon_addrs size
The argument parsing routine already computes the size of the
mon_addrs buffer it extracts from the "command."  Pass it to the
caller so it can use it to provide the length to rbd_get_client().

Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:49 -05:00
Alex Elder 81a8979378 rbd: do a few checks at build time
This is a bit gratuitous, but there are a few things that can be
verified at build time rather than run time, so do that.

Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:49 -05:00
Alex Elder e28fff268e rbd: don't use sscanf() in rbd_add_parse_args()
Make use of a few simple helper routines to parse the arguments
rather than sscanf().  This will treat both missing and too-long
arguments as invalid input (rather than silently truncating the
input in the too-long case).  In time this can also be used by
rbd_add() to use the passed-in buffer in place, rather than copying
its contents into new buffers.

It appears to me that the sscanf() previously used would not
correctly handle a supplied snapshot--the two final "%s" conversion
specifications were not separated by a space, and I'm not sure
how sscanf() handles that situation.  It may not be well-defined.
So that may be a bug this change fixes (but I didn't verify that).

The sizes of the mon_addrs and options buffers are now passed to
rbd_add_parse_args(), so they can be supplied to copy_token().

Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:49 -05:00
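The token-based parsing described in the commit above might look roughly like the following user-space sketch. The commit names copy_token(), but the signatures, the next_token() helper, and token_is_valid() shown here are assumptions made for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skip leading white space, then return the length of the next token.
 * *buf is left pointing at the start of that token. */
size_t next_token(const char **buf)
{
    const char *spaces = " \f\n\r\t\v";

    *buf += strspn(*buf, spaces);
    return strcspn(*buf, spaces);
}

/* Copy the next token into token[] only if it fits, including the
 * terminating NUL.  Always advance past the token and return its
 * length so the caller can tell "missing" from "too long". */
size_t copy_token(const char **buf, char *token, size_t token_size)
{
    size_t len = next_token(buf);

    if (len < token_size) {
        memcpy(token, *buf, len);
        token[len] = '\0';
    }
    *buf += len;
    return len;
}

/* A token is usable only if it is present and was not too long. */
int token_is_valid(size_t len, size_t token_size)
{
    return len > 0 && len < token_size;
}
```

Unlike a "%s" conversion with a field width, nothing is silently truncated: a token that does not fit is left uncopied and reported through the returned length.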
Alex Elder a725f65e52 rbd: encapsulate argument parsing for rbd_add()
Move the code that parses the arguments provided to rbd_add() (which
are supplied via /sys/bus/rbd/add) into a separate function.

Also rename the "mon_dev_name" variable in rbd_add() to be
"mon_addrs".   The variable represents a list of one or more
comma-separated monitor IP addresses, each with an optional port
number.  I think "mon_addrs" captures that notion a little better.

Signed-off-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:48 -05:00
Alex Elder 27cc25943f rbd: simplify error handling in rbd_add()
If a couple pointers are initialized to NULL then a single
"out_nomem" label can be used for all of the memory allocation
failure cases in rbd_add().

Also, get rid of the "irc" local variable there.  There is no
real need for "rc" to be type ssize_t, and it can be used in
the spot "irc" was.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:48 -05:00
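The single-label cleanup pattern from the commit above can be sketched in user space as follows. The structure, function, and parameter names are hypothetical, and free() stands in for kfree(); both accept a NULL pointer, which is what lets one label serve every failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical pair of buffers, loosely modelled on the two strings
 * mentioned in the surrounding commits. */
struct demo_bufs {
    char *mon_addrs;
    char *options;
};

/* fail_second forces the second allocation to fail so the error
 * path can be exercised. */
int demo_alloc_bufs(struct demo_bufs *out, size_t len, int fail_second)
{
    char *mon_addrs = NULL;   /* NULL-initialised so cleanup is uniform */
    char *options = NULL;

    mon_addrs = malloc(len);
    if (!mon_addrs)
        goto out_nomem;

    options = fail_second ? NULL : malloc(len);
    if (!options)
        goto out_nomem;

    out->mon_addrs = mon_addrs;
    out->options = options;
    return 0;

out_nomem:
    /* free(NULL) is a no-op, so it does not matter which step failed. */
    free(options);
    free(mon_addrs);
    return -ENOMEM;
}
```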
Alex Elder 60571c7d55 rbd: reduce memory used for rbd_dev fields
The length of the string containing the monitor address
specification(s) will never exceed the length of the string passed
in to rbd_add().  The same holds true for the ceph + rbd options
string.  So reduce the amount of memory allocated for these to
that length rather than the maximum (1024 bytes).

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:48 -05:00
Alex Elder d720bcb0a8 rbd: have rbd_get_client() return a rbd_client
rbd_get_client() currently returns an error code, and it assigns
the rbd_client field of the rbd_device structure it is passed if
successful.  Instead, have it return the created rbd_client
structure and return a pointer-coded error if there is an error.
This makes the assignment of the client pointer more obvious at the
call site.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:48 -05:00
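The "pointer-coded error" convention mentioned in the commit above is the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idiom. The following is a user-space approximation with hypothetical demo_ names; it assumes, as the kernel does, that the top 4095 address values are never valid pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an encoded pointer. */
long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if p falls in the range reserved for encoded errors. */
int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

struct demo_client {
    int id;
};

/* Return the new object itself, or an encoded error; the caller
 * assigns the result at the call site and checks it there. */
struct demo_client *demo_get_client(int simulate_failure)
{
    struct demo_client *c;

    if (simulate_failure)
        return demo_err_ptr(-ENOMEM);

    c = malloc(sizeof(*c));
    if (!c)
        return demo_err_ptr(-ENOMEM);
    c->id = 1;
    return c;
}
```

The benefit claimed in the commit is visible at the call site: the assignment of the returned pointer and its validity check sit next to each other, instead of being hidden behind an output parameter.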
Alex Elder f0f8cef5a3 rbd: a few simple changes
Here are a few very simple cleanups:
    - Add a "RBD_" prefix to the two driver name string definitions.
    - Move the definition of struct rbd_request below struct rbd_req_coll
      to avoid the need for an empty declaration of the latter.
    - Move and group the definitions of rbd_root_dev_release() and
      rbd_root_dev, as well as rbd_bus_type and rbd_bus_attrs[],
      close to the top of the file.  Arrange the latter so
      rbd_bus_type.bus_attrs can be initialized statically.
    - Get rid of an unnecessary local variable in rbd_open().
    - Rework some hokey logic in rbd_bus_add_dev(), so the value of
      "ret" at the end is either 0 or -ENOENT to avoid the need for
      the code duplication that was there.
    - Rename a goto target in rbd_add().

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:48 -05:00
Alex Elder 432b858749 rbd: rename "node_lock"
The spinlock used to protect rbd_client_list is named "node_lock".
Rename it to "rbd_client_list_lock" to make it more obvious what
it's for.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:48 -05:00
Alex Elder bc534d86be rbd: move ctl_mutex lock inside rbd_client_create()
Since rbd_client_create() is only called in one place, move the
acquisition of the mutex around that call inside that function.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder d97081b0c7 rbd: move ctl_mutex lock inside rbd_get_client()
Since rbd_get_client() is only called in one place, move the
acquisition of the mutex around that call inside that function.

Furthermore, within rbd_get_client(), it appears the mutex only
needs to be held while calling rbd_client_create().  (Moving
the lock inside that function will wait for the next patch.)

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder e6994d3dde rbd: release client list lock sooner
In rbd_get_client(), if a client is reused, a number of things
get done while still holding the list lock unnecessarily.

This just moves a few things that need no lock protection outside
the lock.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder d184f6bfde rbd: restore previous rbd id sequence behavior
It used to be that selecting a new unique identifier for an added
rbd device required searching all existing ones to find the highest
id in use.  A recent change made that unnecessary, but made it
so that id's used were monotonically non-decreasing.  It's a bit
more pleasant to have smaller rbd id's though, and this change
makes ids get allocated as they were before--each new id is one more
than the maximum currently in use.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder 499afd5b8e rbd: tie rbd_dev_list changes to rbd_id operations
The only time entries are added to or removed from the global
rbd_dev_list is exactly when a "put" or "get" operation is being
performed on a rbd_dev's id.  So just move the list management code
into get/put routines.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder e124a82f3c rbd: protect the rbd_dev_list with a spinlock
The rbd_dev_list is just a simple list of all the current
rbd_devices.  Using the ctl_mutex as a concurrency guard is
overkill.  Instead, use a spinlock for that specific purpose.

This also reduces the window that the ctl_mutex needs to be held in
rbd_add().

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder 1ddbe94eda rbd: rework calculation of new rbd id's
In order to select a new unique identifier for an added rbd device,
the list of all existing ones is searched and a value one greater
than the highest id is used.

The list search can be avoided by using an atomic variable that
keeps track of the current highest id.  Using a get/put model for
id's we can limit the boundless growth of id numbers a bit by
arranging to reuse the current highest id once it gets released.
Add these calls to "put" the id when an rbd is getting removed.

Note that this changes the pattern of device id's used--new values
will never be below the highest one seen so far (even if there
exists an unused lower one).  I assert this is OK because the key
property of an rbd id is its uniqueness, not its magnitude.

Regardless, a follow-on patch will restore the old way of doing
things; I just think this commit makes the incremental change
to atomics a little easier to understand.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
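The get/put scheme described in the commit above can be approximated in user space with C11 atomics. This is a sketch under assumptions: the names are hypothetical, and <stdatomic.h> stands in for the kernel's atomic operations.

```c
#include <assert.h>
#include <stdatomic.h>

/* Highest id handed out so far; static storage starts at zero. */
static atomic_long demo_id_max;

/* "Get": bump the counter and use the new value as the id. */
long demo_id_get(void)
{
    return atomic_fetch_add(&demo_id_max, 1) + 1;
}

/* "Put": if the released id is still the highest one, step the
 * counter back so that id can be handed out again.  Any other id
 * is simply dropped, which is why lower ids are never reused. */
void demo_id_put(long id)
{
    long expected = id;

    atomic_compare_exchange_strong(&demo_id_max, &expected, id - 1);
}

/* Read the current counter value (for inspection only). */
long demo_id_current_max(void)
{
    return atomic_load(&demo_id_max);
}
```

Releasing id 2 while id 3 is in use leaves the counter at 3, so the next id is 4; releasing id 4 afterwards lets 4 be reused. That matches the trade-off the message calls out: uniqueness is preserved, but an unused lower id stays unused.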
Alex Elder b7f23c361b rbd: encapsulate new rbd id selection
Move the loop that finds a new unique rbd id to use into
its own helper function.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Josh Durgin cc9d734c3d rbd: use a single value of snap_name to mean no snap
There's already a constant for this anyway.

Since rbd_header_set_snap() is only used to set the rbd device
snap_name field, just do that within that function rather than
having it take the snap_name as an argument.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>

v2: Changed interface rbd_header_set_snap() so it explicitly updates
    the snap_name in the rbd_device.  Also added a BUILD_BUG_ON()
    to verify the size of the snap_name field is sufficient for
    SNAP_HEAD_NAME.
2012-03-22 10:47:47 -05:00
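The build-time size check mentioned in the v2 note above uses the kernel's BUILD_BUG_ON(). A rough user-space equivalent is C11's _Static_assert, sketched here. The value of SNAP_HEAD_NAME, the field length, and the struct layout are all assumptions made for illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define SNAP_HEAD_NAME         "-"
#define DEMO_MAX_SNAP_NAME_LEN 32

struct demo_rbd_device {
    char snap_name[DEMO_MAX_SNAP_NAME_LEN];
};

/* Fails the build, not the run, if the field cannot hold the
 * constant together with its terminating NUL. */
_Static_assert(sizeof(SNAP_HEAD_NAME) <=
               sizeof(((struct demo_rbd_device *)0)->snap_name),
               "snap_name is too small for SNAP_HEAD_NAME");

/* Capacity of the snap_name field in bytes. */
size_t demo_snap_name_capacity(void)
{
    return sizeof(((struct demo_rbd_device *)0)->snap_name);
}
```

Because sizeof is evaluated at compile time, the check adds no run-time cost and the null-pointer expression inside it is never dereferenced.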
Alex Elder 1dbb439913 rbd: do not duplicate ceph_client pointer in rbd_device
The rbd_device structure maintains a duplicate copy of the
ceph_client pointer maintained in its rbd_client structure.  There
appears to be no good reason for this, and its presence presents a
risk of them getting out of synch or otherwise misused.  So kill it
off, and use the rbd_client copy only.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder ee57741c52 rbd: make ceph_parse_options() return a pointer
ceph_parse_options() takes the address of a pointer as an argument
and uses it to return the address of an allocated structure if
successful.  With this interface it is not evident at call sites that
the pointer is always initialized.  Change the interface to return
the address instead (or a pointer-coded error code) to make the
validity of the returned pointer obvious.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder 2107978668 rbd: a few small cleanups
Some minor cleanups in "drivers/block/rbd.c":
    - Use the more meaningful "RBD_MAX_OBJ_NAME_LEN" in place of "96"
      in the definition of RBD_MAX_MD_NAME_LEN.
    - Use DEFINE_SPINLOCK() to define and initialize node_lock.
    - Drop a needless (char *) cast in parse_rbd_opts_token().
    - Make a few minor formatting changes.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Igor Mammedov b9136d207f xen: initialize platform-pci even if xen_emul_unplug=never
When xen_emul_unplug=never is specified on kernel command line
reading files from /sys/hypervisor is broken (returns -EBUSY).
It is caused by xen_bus dependency on platform-pci and
platform-pci isn't initialized when xen_emul_unplug=never is
specified.

Fix it by allowing platform-pci to ignore xen_emul_unplug=never,
and do not initialize xen_[blk|net]front instead.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-22 11:37:11 -04:00
Linus Torvalds 5375871d43 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc merge from Benjamin Herrenschmidt:
 "Here's the powerpc batch for this merge window.  It is going to be a
  bit more nasty than usual as in touching things outside of
  arch/powerpc mostly due to the big iSeriesectomy :-) We finally got
  rid of the bugger (legacy iSeries support) which was a PITA to
  maintain and that nobody really used anymore.

  Here are some of the highlights:

   - Legacy iSeries is gone.  Thanks Stephen ! There's still some bits
     and pieces remaining if you do a grep -ir series arch/powerpc but
     they are harmless and will be removed in the next few weeks
     hopefully.

   - The 'fadump' functionality (Firmware Assisted Dump) replaces the
     previous (equivalent) "pHyp assisted dump"...  it's a rewrite of a
     mechanism to get the hypervisor to do crash dumps on pSeries, the
     new implementation hopefully being much more reliable.  Thanks
     Mahesh Salgaonkar.

   - The "EEH" code (pSeries PCI error handling & recovery) got a big
     spring cleaning, motivated by the need to be able to implement a
     new backend for it on top of some new different type of firmware.

     The work isn't complete yet, but a good chunk of the cleanups is
     there.  Note that this adds a field to struct device_node which is
     not very nice and which Grant objects to.  I will have a patch soon
     that moves that to a powerpc private data structure (hopefully
     before rc1) and we'll improve things further later on (hopefully
     getting rid of the need for that pointer completely).  Thanks Gavin
     Shan.

   - I dug into our exception & interrupt handling code to improve the
     way we do lazy interrupt handling (and make it work properly with
     "edge" triggered interrupt sources), and while at it found & fixed
     a wagon of issues in those areas, including adding support for page
     fault retry & fatal signals on page faults.

   - Your usual random batch of small fixes & updates, including a bunch
     of new embedded boards, both Freescale and APM based ones, etc..."

I fixed up some conflicts with the generalized irq-domain changes from
Grant Likely, hopefully correctly.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (141 commits)
  powerpc/ps3: Do not adjust the wrapper load address
  powerpc: Remove the rest of the legacy iSeries include files
  powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces
  init: Remove CONFIG_PPC_ISERIES
  powerpc: Remove FW_FEATURE ISERIES from arch code
  tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable
  powerpc/spufs: Fix double unlocks
  powerpc/5200: convert mpc5200 to use of_platform_populate()
  powerpc/mpc5200: add options to mpc5200_defconfig
  powerpc/mpc52xx: add a4m072 board support
  powerpc/mpc5200: update mpc5200_defconfig to fit for charon board
  Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup
  powerpc/44x: Add additional device support for APM821xx SoC and Bluestone board
  powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board
  MAINTAINERS: Update PowerPC 4xx tree
  powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board
  powerpc: document the FSL MPIC message register binding
  powerpc: add support for MPIC message register API
  powerpc/fsl: Added aliased MSIIR register address to MSI node in dts
  powerpc/85xx: mpc8548cds - add 36-bit dts
  ...
2012-03-21 18:55:10 -07:00
Linus Torvalds 9f3938346a Merge branch 'kmap_atomic' of git://github.com/congwang/linux
Pull kmap_atomic cleanup from Cong Wang.

It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().

Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.

* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
  feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
  highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
  drbd: remove the second argument of k[un]map_atomic()
  zcache: remove the second argument of k[un]map_atomic()
  gma500: remove the second argument of k[un]map_atomic()
  dm: remove the second argument of k[un]map_atomic()
  tomoyo: remove the second argument of k[un]map_atomic()
  sunrpc: remove the second argument of k[un]map_atomic()
  rds: remove the second argument of k[un]map_atomic()
  net: remove the second argument of k[un]map_atomic()
  mm: remove the second argument of k[un]map_atomic()
  lib: remove the second argument of k[un]map_atomic()
  power: remove the second argument of k[un]map_atomic()
  kdb: remove the second argument of k[un]map_atomic()
  udf: remove the second argument of k[un]map_atomic()
  ubifs: remove the second argument of k[un]map_atomic()
  squashfs: remove the second argument of k[un]map_atomic()
  reiserfs: remove the second argument of k[un]map_atomic()
  ocfs2: remove the second argument of k[un]map_atomic()
  ntfs: remove the second argument of k[un]map_atomic()
  ...
2012-03-21 09:40:26 -07:00
Linus Torvalds 69a7aebcf0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
 "It's indeed trivial -- mostly documentation updates and a bunch of
  typo fixes from Masanari.

  There are also several linux/version.h include removals from Jesper."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits)
  kcore: fix spelling in read_kcore() comment
  constify struct pci_dev * in obvious cases
  Revert "char: Fix typo in viotape.c"
  init: fix wording error in mm_init comment
  usb: gadget: Kconfig: fix typo for 'different'
  Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c"
  writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header
  writeback: fix typo in the writeback_control comment
  Documentation: Fix multiple typo in Documentation
  tpm_tis: fix tis_lock with respect to RCU
  Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c"
  Doc: Update numastat.txt
  qla4xxx: Add missing spaces to error messages
  compiler.h: Fix typo
  security: struct security_operations kerneldoc fix
  Documentation: broken URL in libata.tmpl
  Documentation: broken URL in filesystems.tmpl
  mtd: simplify return logic in do_map_probe()
  mm: fix comment typo of truncate_inode_pages_range
  power: bq27x00: Fix typos in comment
  ...
2012-03-20 21:12:50 -07:00
Linus Torvalds ed378a52da USB merge for 3.4-rc1
Here's the big USB merge for the 3.4-rc1 merge window.
 
 Lots of gadget driver reworks here, driver updates, xhci changes, some
 new drivers added, usb-serial core reworking to fix some bugs, and other
 various minor things.
 
 There are some patches touching arch code, but they have all been acked
 by the various arch maintainers.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'usb-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB merge for 3.4-rc1 from Greg KH:
 "Here's the big USB merge for the 3.4-rc1 merge window.

  Lots of gadget driver reworks here, driver updates, xhci changes, some
  new drivers added, usb-serial core reworking to fix some bugs, and
  other various minor things.

  There are some patches touching arch code, but they have all been
  acked by the various arch maintainers."

* tag 'usb-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (302 commits)
  net: qmi_wwan: add support for ZTE MF820D
  USB: option: add ZTE MF820D
  usb: gadget: f_fs: Remove lock is held before freeing checks
  USB: option: make interface blacklist work again
  usb/ub: deprecate & schedule for removal the "Low Performance USB Block" driver
  USB: ohci-pxa27x: add clk_prepare/clk_unprepare calls
  USB: use generic platform driver on ath79
  USB: EHCI: Add a generic platform device driver
  USB: OHCI: Add a generic platform device driver
  USB: ftdi_sio: new PID: LUMEL PD12
  USB: ftdi_sio: add support for FT-X series devices
  USB: serial: mos7840: Fixed MCS7820 device attach problem
  usb: Don't make USB_ARCH_HAS_{XHCI,OHCI,EHCI} depend on USB_SUPPORT.
  usb gadget: fix a section mismatch when compiling g_ffs with CONFIG_USB_FUNCTIONFS_ETH
  USB: ohci-nxp: Remove i2c_write(), use smbus
  USB: ohci-nxp: Support for LPC32xx
  USB: ohci-nxp: Rename symbols from pnx4008 to nxp
  USB: OHCI-HCD: Rename ohci-pnx4008 to ohci-nxp
  usb: gadget: Kconfig: fix typo for 'different'
  usb: dwc3: pci: fix another failure path in dwc3_pci_probe()
  ...
2012-03-20 11:26:30 -07:00
Cong Wang 589973a704 drbd: remove the second argument of k[un]map_atomic()
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:29 +08:00
Cong Wang cfd8005c99 block: remove the second argument of k[un]map_atomic()
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:16 +08:00
Steven Noonan 3467811e26 xen-blkfront: make blkif_io_lock spinlock per-device
This patch moves the global blkif_io_lock to the per-device structure. The
spinlock seems to exist for two reasons: to disable IRQs when in the interrupt
handlers for blkfront, and to protect the blkfront VBDs when a detachment is
requested.

Having a global blkif_io_lock doesn't make sense given the use case, and it
drastically hinders performance due to contention. All VBDs with pending IOs
have to take the lock in order to get work done, which serializes everything
pretty badly.

Signed-off-by: Steven Noonan <snoonan@amazon.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-20 12:52:41 +01:00
Andrew Jones dad5cf659b xen/blkfront: don't put bdev right after getting it
We should hang onto bdev until we're done with it.

Signed-off-by: Andrew Jones <drjones@redhat.com>
[v1: Fixed up git commit description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-20 12:52:41 +01:00
Akinobu Mita 34ae2e47d9 xen-blkfront: use bitmap_set() and bitmap_clear()
Use bitmap_set and bitmap_clear rather than modifying individual bits
in a memory region.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-20 12:52:41 +01:00
Daniel De Graaf b2167ba6dd xen/blkback: Enable blkback on HVM guests
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-20 12:52:41 +01:00
Daniel De Graaf 4f14faaab4 xen/blkback: use grant-table.c hypercall wrappers
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-03-20 12:52:41 +01:00
Sebastian Andrzej Siewior 7396bd9fa1 usb/ub: deprecate & schedule for removal the "Low Performance USB Block" driver
Deprecate this driver. All devices which can be handled by this driver
can also be handled by the usb-storage driver.

Acked-By: Pete Zaitcev <zaitcev@redhat.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-16 13:30:10 -07:00
Stephen Rothwell ba7a4822b4 powerpc: Remove some of the legacy iSeries specific device drivers
These drivers are specific to the PowerPC legacy iSeries platform and
their Kconfig is specified in arch/powerpc.  Legacy iSeries is being
removed, so these drivers can no longer be selected.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-16 09:28:05 +11:00
Linus Torvalds f1cbd03f5e Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Been sitting on this for a while, but lets get this out the door.
  This fixes various important bugs for 3.3 final, along with a few more
  trivial ones.  Please pull!"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix ioc leak in put_io_context
  block, sx8: fix pointer math issue getting fw version
  Block: use a freezable workqueue for disk-event polling
  drivers/block/DAC960: fix -Wuninitialized warning
  drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning
  block: fix __blkdev_get and add_disk race condition
  block: Fix setting bio flags in drivers (sd_dif/floppy)
  block: Fix NULL pointer dereference in sd_revalidate_disk
  block: exit_io_context() should call elevator_exit_icq_fn()
  block: simplify ioc_release_fn()
  block: replace icq->changed with icq->flags
2012-03-14 17:16:45 -07:00
Greg Kroah-Hartman f7a0d426f3 Merge 3.3-rc7 into usb-next
This resolves the conflict with drivers/usb/host/ehci-fsl.h that
happened with changes in Linus's and this branch at the same time.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-12 09:13:31 -07:00
Muthu Kumar 9354f1b8e6 floppy/scsi: fix setting of BIO flags
Fix setting bio flags in drivers (sd_dif/floppy).

Signed-off-by: Muthukumar R <muthur@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-05 15:49:43 -08:00
Dan Carpenter ea5f4db8ec block, sx8: fix pointer math issue getting fw version
"mem" is type u8.  We need parenthesis here or it screws up the pointer
math probably leading to an oops.
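
A hedged illustration of the precedence issue described above (this is not the sx8 code itself; the function names and the test buffer are hypothetical). A cast binds tighter than '+', so casting a byte pointer before adding an offset scales the offset by the size of the target type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cast applied first: the addition is done in units of uint32_t,
 * so the result lies off * sizeof(uint32_t) bytes past mem.
 * The caller must supply a buffer large enough for that. */
static size_t offset_cast_first(uint8_t *mem, size_t off)
{
    uint32_t *p = (uint32_t *)mem + off;
    return (size_t)((uint8_t *)p - mem);
}

/* Addition applied first (parenthesized): the result lies off bytes
 * past mem, which is what a byte offset into a buffer normally means. */
static size_t offset_add_first(uint8_t *mem, size_t off)
{
    /* Keep the converted pointer suitably aligned for uint32_t. */
    assert(off % sizeof(uint32_t) == 0);
    uint32_t *p = (uint32_t *)(mem + off);
    return (size_t)((uint8_t *)p - mem);
}
```

With a byte offset of 4, the first form lands 16 bytes into the buffer and the second lands 4 bytes in, which is the kind of difference the fix addresses.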

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-03 19:44:39 +01:00
Danny Kukawka cecd353a02 drivers/block/DAC960: fix -Wuninitialized warning
Set CommandMailbox with memset before using it. Fix for:

drivers/block/DAC960.c: In function ‘DAC960_V1_EnableMemoryMailboxInterface’:
arch/x86/include/asm/io.h:61:1: warning: ‘CommandMailbox.Bytes[12]’
 may be used uninitialized in this function [-Wuninitialized]
drivers/block/DAC960.c:1175:30: note: ‘CommandMailbox.Bytes[12]’
 was declared here

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-02 10:48:35 +01:00
Danny Kukawka bca505f109 drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning
Fixed compiler warning:

comparison between ‘DAC960_V2_IOCTL_Opcode_T’ and ‘enum <anonymous>’

Renamed enum, added a new enum for SCSI_10.CommandOpcode in
DAC960_V2_ProcessCompletedCommand().

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-02 10:48:32 +01:00
Muthukumar R 12ebffd146 block: Fix setting bio flags in drivers (sd_dif/floppy)
Fix setting bio flags in drivers (sd_dif/floppy).

Signed-off-by: Muthukumar R <muthur@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-02 10:40:58 +01:00
Sebastian Andrzej Siewior 7ac4704c09 usb/storage: a couple defines from drivers/usb/storage/transport.h to include/linux/usb/storage.h
This moves the BOT data structures for CBW and CSW from the driver's internal
header file to a globally includable file in include/.
The storage gadget uses the same name for CSW but a different one for
CBW, so I fix it up properly. The same goes for the ub driver and the keucr
driver in staging.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-28 11:05:18 -08:00
Hitoshi Mitake 797a796a13 asm-generic: architecture independent readq/writeq for 32bit environment
This provides unified readq()/writeq() helper functions for 32-bit
drivers.

In some cases, readq/writeq without atomicity is harmful, and the order of
io access has to be specified explicitly.  So in this patch, two new
header files which contain non-atomic readq/writeq are added.

 - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/
   writeq with the order of lower address -> higher address

 - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/
   writeq with reversed order

This allows us to remove some readq()s that were added to drivers when the
default non-atomic ones were removed in commit dbee8a0aff ("x86:
remove 32-bit versions of readq()/writeq()").

The drivers which need readq/writeq but can do with the non-atomic ones
must add the line:

  #include <asm-generic/io-64-nonatomic-lo-hi.h> /* or hi-lo.h */

But this will be a no-op in 64-bit environments, and no other #ifdefs are
required.  So I believe that this patch can solve the problem of
 1. driver-specific readq/writeq
 2. atomicity and order of io access
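
A minimal sketch of what a non-atomic, lower-address-first 64-bit access looks like, assuming a register pair that can be viewed as two adjacent 32-bit words. The names readq_lo_hi_sketch and writeq_lo_hi_sketch are hypothetical and operate on ordinary memory here; the real helpers in the headers named above go through the kernel's MMIO accessors instead:

```c
#include <assert.h>
#include <stdint.h>

/* Read a 64-bit value as two 32-bit reads, lower address first.
 * The two halves are read separately, so the access is not atomic. */
static uint64_t readq_lo_hi_sketch(const volatile uint32_t *addr)
{
    assert(addr != 0);
    uint32_t low = addr[0];   /* lower address */
    uint32_t high = addr[1];  /* higher address */
    return ((uint64_t)high << 32) | low;
}

/* Write a 64-bit value as two 32-bit writes, lower address first. */
static void writeq_lo_hi_sketch(uint64_t val, volatile uint32_t *addr)
{
    assert(addr != 0);
    addr[0] = (uint32_t)val;          /* low half to lower address */
    addr[1] = (uint32_t)(val >> 32);  /* high half to higher address */
}
```

The hi-lo variant would only swap the order of the two accesses in each function; which order a device needs depends on the hardware, which is why the commit makes the choice explicit through the header that is included.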

This patch is tested with building allyesconfig and allmodconfig as
ARCH=x86 and ARCH=i386 on top of tip/master.

Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Ravi Anand <ravi.anand@qlogic.com>
Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: James Bottomley <James.Bottomley@parallels.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-21 16:47:28 -08:00
Jesper Juhl d0156f4d62 NVM Express: Remove unneeded include of linux/version.h from nvme.c
There's no need for drivers/block/nvme.c to include linux/version.h,
so remove the include.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-02-21 11:48:54 +01:00
Linus Torvalds 3ec1e88b33 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Says Jens:

 "Time to push off some of the pending items.  I really wanted to wait
  until we had the regression nailed, but alas it's not quite there yet.
  But I'm very confident that it's "just" a missing expire on exit, so
  fix from Tejun should be fairly trivial.  I'm headed out for a week on
  the slopes.

  - Killing the barrier part of mtip32xx.  It doesn't really support
    barriers, and it doesn't need them (writes are fully ordered).

  - A few fixes from Dan Carpenter, preventing overflows of integer
    multiplication.

  - A fixup for loop, fixing a previous commit that didn't quite solve
    the partial read problem from Dave Young.

  - A bio integer overflow fix from Kent Overstreet.

  - Improvement/fix of the door "keep locked" part of the cdrom shared
    code from Paolo Bonzini.

  - A few cfq fixes from Shaohua Li.

  - A fix for bsg sysfs warning when removing a file it did not create
    from Stanislaw Gruszka.

  - Two fixes for floppy from Vivek, preventing a crash.

  - A few block core fixes from Tejun.  One killing the over-optimized
    ioc exit path, cleaning that up nicely.  Two others fixing an oops
    on elevator switch, due to calling into the scheduler merge check
    code without holding the queue lock."

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix lockdep warning on io_context release put_io_context()
  relay: prevent integer overflow in relay_open()
  loop: zero fill bio instead of return -EIO for partial read
  bio: don't overflow in bio_get_nr_vecs()
  floppy: Fix a crash during rmmod
  floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never called
  cdrom: move shared static to cdrom_device_info
  bsg: fix sysfs link remove warning
  block: don't call elevator callbacks for plug merges
  block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator functions
  mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the unused member of struct driver_data
  block: strip out locking optimization in put_io_context()
  cdrom: use copy_to_user() without the underscores
  block: fix ioc locking warning
  block: fix NULL icq_cache reference
  block,cfq: change code order
2012-02-11 10:07:11 -08:00
Dave Young 306df0716a loop: zero fill bio instead of return -EIO for partial read
Commit 8268f5a741 ("deny partial write for loop dev fd") tried to fix the
loop device partial read information leak problem, but it changed the
semantics of read behavior.  When we read beyond the end of the device we
should get 0 bytes, which is the normal behavior; we should not just return
-EIO.

Instead of returning -EIO, zero out the bio to avoid an information leak in
case of a partial read.
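
The idea can be sketched in plain C, outside the kernel, under the assumption that the backing store is a byte array and the request is a flat buffer. read_zero_fill and its parameters are hypothetical names, not the loop driver's functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy up to len bytes starting at pos from the backing store into buf.
 * Whatever part of buf cannot be satisfied is zeroed rather than left
 * holding stale data, and the short read is not treated as an error.
 * Returns the number of bytes that actually came from the backing store. */
static size_t read_zero_fill(const uint8_t *backing, size_t backing_len,
                             size_t pos, uint8_t *buf, size_t len)
{
    assert(backing != NULL && buf != NULL);

    size_t avail = pos < backing_len ? backing_len - pos : 0;
    size_t n = avail < len ? avail : len;

    if (n > 0)
        memcpy(buf, backing + pos, n);
    memset(buf + n, 0, len - n);  /* zero the tail: no information leak */
    return n;
}
```

A read that starts at or beyond the end of the backing store returns 0 with a fully zeroed buffer, which matches the "0 bytes, not -EIO" behavior the message asks for.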

Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Dmitry Monakhov <dmonakhov@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-02-08 22:07:19 +01:00
Vivek Goyal 4609dff6b5 floppy: Fix a crash during rmmod
floppy driver does not call add_disk() on all the drives hence we don't take
gendisk reference on request queue for these drives. Don't call put_disk()
with disk->queue set, otherwise we try to put the reference we never took.

Reported-and-tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-02-08 20:03:39 +01:00
Vivek Goyal 3f9a5aabd0 floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never called
add_disk() takes a gendisk reference on the request queue. If the driver failed
during initialization and never called add_disk(), then that extra reference is
not taken. That reference is put in put_disk(). The floppy driver allocates the
disk, allocates the queue, sets disk->queue and then realizes that the floppy
controller is not present. It tries to tear down everything and tries to
put a reference down in put_disk() which was never taken.

In such error cases cleanup disk->queue before calling put_disk() so that
we never try to put down a reference which was never taken in first place.

Reported-and-tested-by: Suresh Jayaraman <sjayaraman@suse.com>
Tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-02-08 20:03:38 +01:00
Asai Thambi S P 4e8670e261 mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the unused member of struct driver_data
Removed the following:
	* irrelevant argument 'barrier' of mtip_hw_submit_io()
	* unused member 'eh_active' of struct driver_data

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-02-07 07:54:31 +01:00
Linus Torvalds 6c073a7ee2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: fix safety of rbd_put_client()
  rbd: fix a memory leak in rbd_get_client()
  ceph: create a new session lock to avoid lock inversion
  ceph: fix length validation in parse_reply_info()
  ceph: initialize client debugfs outside of monc->mutex
  ceph: change "ceph.layout" xattr to be "ceph.file.layout"
2012-02-02 15:47:33 -08:00
Alex Elder d23a4b3fd6 rbd: fix safety of rbd_put_client()
The rbd_client structure uses a kref to arrange for cleaning up and
freeing an instance when its last reference is dropped.  The cleanup
routine is rbd_client_release(), and one of the things it does is
delete the rbd_client from rbd_client_list.  It acquires node_lock
to do so, but the way it is done is still not safe.

The problem is that when attempting to reuse an existing rbd_client,
the structure found might already be in the process of getting
destroyed and cleaned up.

Here's the scenario, with "CLIENT" representing an existing
rbd_client that's involved in the race:

 Thread on CPU A                | Thread on CPU B
 ---------------                | ---------------
 rbd_put_client(CLIENT)         | rbd_get_client()
   kref_put()                   |   (acquires node_lock)
     kref->refcount becomes 0   |   __rbd_client_find() returns CLIENT
     calls rbd_client_release() |   kref_get(&CLIENT->kref);
                                |   (releases node_lock)
       (acquires node_lock)     |
       deletes CLIENT from list | ...and starts using CLIENT...
       (releases node_lock)     |
       and frees CLIENT         | <-- but CLIENT gets freed here

Fix this by having rbd_put_client() acquire node_lock.  The result
could still be improved, but at least it avoids this problem.
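
A userspace sketch of the fix, assuming a pthread mutex in place of node_lock and a plain integer in place of the kref. All names here (sketch_client and the sketch_client_* functions) are hypothetical; the point is only that the final decrement and the removal from the list happen under the same lock that lookups take:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_client {
    int refcount;
    struct sketch_client *next;
};

static pthread_mutex_t node_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sketch_client *client_list;

/* Allocate a client holding one reference and publish it on the list. */
static struct sketch_client *sketch_client_create(void)
{
    struct sketch_client *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->refcount = 1;
    pthread_mutex_lock(&node_lock);
    c->next = client_list;
    client_list = c;
    pthread_mutex_unlock(&node_lock);
    return c;
}

/* Look up the first listed client and take a reference, all under the lock. */
static struct sketch_client *sketch_client_find_get(void)
{
    pthread_mutex_lock(&node_lock);
    struct sketch_client *c = client_list;
    if (c)
        c->refcount++;
    pthread_mutex_unlock(&node_lock);
    return c;
}

/* Drop a reference. The decrement and the unlink are done under node_lock,
 * so a concurrent lookup can never find a client whose count already hit 0.
 * Returns true if this call released the client. */
static bool sketch_client_put(struct sketch_client *c)
{
    pthread_mutex_lock(&node_lock);
    assert(c->refcount > 0);
    if (--c->refcount > 0) {
        pthread_mutex_unlock(&node_lock);
        return false;
    }
    for (struct sketch_client **pp = &client_list; *pp; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            break;
        }
    }
    pthread_mutex_unlock(&node_lock);
    free(c);  /* no longer reachable from the list, safe to free */
    return true;
}
```

In the table above, CPU B's lookup could slip in between the count reaching zero and the list deletion; with the put path holding the lock across both steps, that window is closed.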

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-02 12:56:59 -08:00
Alex Elder 97bb59a03d rbd: fix a memory leak in rbd_get_client()
If an existing rbd client is found to be suitable for use in
rbd_get_client(), the rbd_options structure is not being
freed as it should.  Fix that.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-02-02 12:49:27 -08:00
Linus Torvalds 93c3d65b28 nvme: fix merge error due to change of 'make_request_fn' fn type
The type of 'make_request_fn' changed in 5a7bbad27a ("block: remove
support for bio remapping from ->make_request"), but the merge of the
nvme driver didn't take that into account, and as a result the driver
would compile with a warning:

  drivers/block/nvme.c: In function 'nvme_alloc_ns':
  drivers/block/nvme.c:1336:2: warning: passing argument 2 of 'blk_queue_make_request' from incompatible pointer type [enabled by default]
  include/linux/blkdev.h:830:13: note: expected 'void (*)(struct request_queue *, struct bio *)' but argument is of type 'int (*)(struct request_queue *, struct bio *)'

It's benign, but the warning is annoying.

Reported-by: Stephen Rothwell <sfr@canb.auug.org>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-18 15:41:27 -08:00
Linus Torvalds 92b5abbb44 Merge git://git.infradead.org/users/willy/linux-nvme
* git://git.infradead.org/users/willy/linux-nvme: (105 commits)
  NVMe: Set number of queues correctly
  NVMe: Version 0.8
  NVMe: Set queue flags correctly
  NVMe: Simplify nvme_unmap_user_pages
  NVMe: Mark the end of the sg list
  NVMe: Fix DMA mapping for admin commands
  NVMe: Rename IO_TIMEOUT to NVME_IO_TIMEOUT
  NVMe: Merge the nvme_bio and nvme_prp data structures
  NVMe: Change nvme_completion_fn to take a dev
  NVMe: Change get_nvmeq to take a dev instead of a namespace
  NVMe: Simplify completion handling
  NVMe: Update Identify Controller data structure
  NVMe: Implement doorbell stride capability
  NVMe: Version 0.7
  NVMe: Don't probe namespace 0
  Fix calculation of number of pages in a PRP List
  NVMe: Create nvme_identify and nvme_get_features functions
  NVMe: Fix memory leak in nvme_dev_add()
  NVMe: Fix calls to dma_unmap_sg
  NVMe: Correct sg list setup in nvme_map_user_pages
  ...
2012-01-18 12:34:09 -08:00
Linus Torvalds 16008d6416 Merge branch 'for-3.3/drivers' of git://git.kernel.dk/linux-block
* 'for-3.3/drivers' of git://git.kernel.dk/linux-block:
  mtip32xx: do rebuild monitoring asynchronously
  xen-blkfront: Use kcalloc instead of kzalloc to allocate array
  mtip32xx: uninitialized variable in mtip_quiesce_io()
  mtip32xx: updates based on feedback
  xen-blkback: convert hole punching to discard request on loop devices
  xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io
  xen/blk[front|back]: Enhance discard support with secure erasing support.
  xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard together
  mtip32xx: update to new ->make_request() API
  mtip32xx: add module.h include to avoid conflict with moduleh tree
  mtip32xx: mark a few more items static
  mtip32xx: ensure that all local functions are static
  mtip32xx: cleanup compat ioctl handling
  mtip32xx: fix warnings/errors on 32-bit compiles
  block: Add driver for Micron RealSSD pcie flash cards
2012-01-15 12:48:41 -08:00
Linus Torvalds b3c9dd182e Merge branch 'for-3.3/core' of git://git.kernel.dk/linux-block
* 'for-3.3/core' of git://git.kernel.dk/linux-block: (37 commits)
  Revert "block: recursive merge requests"
  block: Stop using macro stubs for the bio data integrity calls
  blockdev: convert some macros to static inlines
  fs: remove unneeded plug in mpage_readpages()
  block: Add BLKROTATIONAL ioctl
  block: Introduce blk_set_stacking_limits function
  block: remove WARN_ON_ONCE() in exit_io_context()
  block: an exiting task should be allowed to create io_context
  block: ioc_cgroup_changed() needs to be exported
  block: recursive merge requests
  block, cfq: fix empty queue crash caused by request merge
  block, cfq: move icq creation and rq->elv.icq association to block core
  block, cfq: restructure io_cq creation path for io_context interface cleanup
  block, cfq: move io_cq exit/release to blk-ioc.c
  block, cfq: move icq cache management to block core
  block, cfq: move io_cq lookup to blk-ioc.c
  block, cfq: move cfqd->icq_list to request_queue and add request->elv.icq
  block, cfq: reorganize cfq_io_context into generic and cfq specific parts
  block: remove elevator_queue->ops
  block: reorder elevator switch sequence
  ...

Fix up conflicts in:
 - block/blk-cgroup.c
	Switch from can_attach_task to can_attach
 - block/cfq-iosched.c
	conflict with now removed cic index changes (we now use q->id instead)
2012-01-15 12:24:45 -08:00
Jens Axboe 85a0f7b220 Merge branch 'for-3.3/mtip32xx' into for-3.3/drivers 2012-01-15 10:39:35 +01:00
Paolo Bonzini 577ebb374c block: add and use scsi_blk_cmd_ioctl
Introduce a wrapper around scsi_cmd_ioctl that takes a block device.

The function will then be enhanced to detect partition block devices
and, in that case, subject the ioctls to whitelisting.

Cc: linux-scsi@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-14 15:07:24 -08:00
Linus Torvalds 0a80939b3e Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1
Merge tag 'for-linus' of git://github.com/rustyrussell/linux

* tag 'for-linus' of git://github.com/rustyrussell/linux:
  module_param: check that bool parameters really are bool.
  intelfbdrv.c: bailearly is an int module_param
  paride/pcd: fix bool verbose module parameter.
  module_param: make bool parameters really bool (drivers & misc)
  module_param: make bool parameters really bool (arch)
  module_param: make bool parameters really bool (core code)
  kernel/async: remove redundant declaration.
  printk: fix unnecessary module_param_name.
  lirc_parallel: fix module parameter description.
  module_param: avoid bool abuse, add bint for special cases.
  module_param: check type correctness for module_param_array
  modpost: use linker section to generate table.
  modpost: use a table rather than a giant if/else statement.
  modules: sysfs - export: taint, coresize, initsize
  kernel/params: replace DEBUGP with pr_debug
  module: replace DEBUGP with pr_debug
  module: struct module_ref should contains long fields
  module: Fix performance regression on modules with large symbol tables
  module: Add comments describing how the "strmap" logic works

Fix up conflicts in scripts/mod/file2alias.c due to the new linker-
generated table approach to adding __mod_*_device_table entries.  The
ARM sa11x0 mcp bus needed to be converted to that too.
2012-01-14 12:32:16 -08:00
Linus Torvalds 1a52bb0b68 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: ensure prealloc_blob is in place when removing xattr
  rbd: initialize snap_rwsem in rbd_add()
  ceph: enable/disable dentry complete flags via mount option
  vfs: export symbol d_find_any_alias()
  ceph: always initialize the dentry in open_root_dentry()
  libceph: remove useless return value for osd_client __send_request()
  ceph: avoid iput() while holding spinlock in ceph_dir_fsync
  ceph: avoid useless dget/dput in encode_fh
  ceph: dereference pointer after checking for NULL
  crush: fix force for non-root TAKE
  ceph: remove unnecessary d_fsdata conditional checks
  ceph: Use kmemdup rather than duplicating its implementation

Fix up conflicts in fs/ceph/super.c (d_alloc_root() failure handling vs
always initialize the dentry in open_root_dentry)
2012-01-13 10:29:21 -08:00
Rusty Russell 1b9fbafb3a paride/pcd: fix bool verbose module parameter.
Dan Carpenter points out that it's an int, not a bool:

pcd.c:427:				if (verbose > 1)
pcd.c:433:				if (verbose > 1)
pcd.c:437:				if (verbose < 2)
pcd.c:506:#define DBMSG(msg)	((verbose>1)?(msg):NULL)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
2012-01-13 09:32:26 +10:30
Rusty Russell 90ab5ee941 module_param: make bool parameters really bool (drivers & misc)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:20 +10:30
Alex Elder 0e805a1d85 rbd: initialize snap_rwsem in rbd_add()
New rbd device structures get initialized in rbd_add().  Many of
the fields rely on being initially zero-filled.  However, lockdep
was noticing that the rw_semaphore embedded in the header field
was not getting properly initialized.  Fix that.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-01-12 11:00:50 -08:00
Amit Shah f8fb5bc23a virtio: blk: Add freeze, restore handlers to support S4
Delete the vq and flush any pending requests from the block queue on the
freeze callback to prepare for hibernation.

Re-create the vq in the restore callback to resume normal function.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-12 15:44:45 +10:30
Amit Shah 6abd6e5a44 virtio: blk: Move vq initialization to separate function
The probe and PM restore functions will share this code.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-12 15:44:45 +10:30
Michael S. Tsirkin 4678d6f970 virtio_blk: fix config handler race
Fix a theoretical race related to config work
handler: a config interrupt might happen
after we flush config work but before we
reset the device. It will then cause the
config work to run during or after reset.

Two problems with this:
- if this runs after device is gone we will get use after free
- access of config while reset is in progress is racy
(as layout is changing).

As a solution
1. flush after reset when we know there will be no more interrupts
2. add a flag to disable config access before reset

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-12 15:44:44 +10:30
Rusty Russell f96fde41f7 virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf
Remove wrapper functions. This makes the allocation type explicit in
all callers; I used GFP_KERNEL where it seemed obvious, left it at
GFP_ATOMIC otherwise.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2012-01-12 15:44:42 +10:30
Matthew Wilcox df34813990 NVMe: Set number of queues correctly
The number of submission & completion queues should be set by calling
Set Features, not Get Features.

Reported-by: Kwok Kong <Kwok.Kong@idt.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-11 09:22:24 -05:00
Linus Torvalds 4690dfa8cd Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Wire-up new system calls
  microblaze: Remove NO_IRQ from architecture
  input: xilinx_ps2: Don't use NO_IRQ
  block: xsysace: Don't use NO_IRQ
  microblaze: Trivial asm fix
  microblaze: Fix debug message in module
  microblaze: Remove eprintk macro
  microblaze: Send CR before LF for early console
  microblaze: Change NO_IRQ to 0
  microblaze: Use irq_of_parse_and_map for timer
  microblaze: intc: Change variable name
  microblaze: Use of_find_compatible_node for timer and intc
  microblaze: Add __cmpdi2
  microblaze: Synchronize __pa __va macros
2012-01-10 17:37:49 -08:00
Matthew Wilcox 366e8217e5 NVMe: Version 0.8
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 16:30:15 -05:00
Matthew Wilcox 4eeb9215a0 NVMe: Set queue flags correctly
QUEUE_FLAG_* are bit numbers rather than bit masks (other than
QUEUE_FLAG_DEFAULT), so they cannot be ORed together.  Set the queue flags
using queue_flag_set_unlocked().
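
To illustrate why such constants cannot be ORed, here is a small sketch that assumes the flags are bit positions rather than masks. The SKETCH_FLAG_* values and sketch_flag_set() are hypothetical stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical flags defined as bit positions, not as masks. */
enum {
    SKETCH_FLAG_A = 2,
    SKETCH_FLAG_B = 4
};

/* Set one flag by shifting its bit position into place. */
static void sketch_flag_set(unsigned long *flags, unsigned int bit)
{
    assert(bit < sizeof(*flags) * CHAR_BIT);
    *flags |= 1UL << bit;
}
```

ORing the two constants directly gives 2 | 4 == 6, which as a mask means bits 1 and 2. Setting them one at a time through the helper gives (1 << 2) | (1 << 4) == 20, which is the intended result.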

Reported-by: Donald Wood <donald.e.wood@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 16:29:23 -05:00
Matthew Wilcox 1c2ad9faaf NVMe: Simplify nvme_unmap_user_pages
By using the iod->nents field (the same way other I/O paths do), we can
avoid recalculating the number of sg entries at unmap time, and make
nvme_unmap_user_pages() easier to call.

Also, use the 'write' parameter instead of assuming DMA_FROM_DEVICE.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:54:22 -05:00
Matthew Wilcox fe304c43c6 NVMe: Mark the end of the sg list
For user I/O and admin commands, we were forgetting to mark the end of
the SG list.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:54:14 -05:00
Matthew Wilcox 497421880a NVMe: Fix DMA mapping for admin commands
We were always mapping as DMA_FROM_DEVICE then unmapping with
DMA_TO_DEVICE which was clearly not correct.  Follow the same pattern as
nvme_submit_io() and key off the bottom bit of the opcode to determine
whether this is a read or a write.
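
A hedged sketch of deriving the direction once and using it for both map and unmap. It assumes that a set bottom bit in the opcode means data flows from host to device (a write); the enum, struct and function names are hypothetical and do not come from the driver:

```c
#include <assert.h>
#include <stdint.h>

enum sketch_dma_dir {
    SKETCH_DMA_TO_DEVICE,    /* host memory is read by the device (write) */
    SKETCH_DMA_FROM_DEVICE   /* host memory is filled by the device (read) */
};

/* Assumption: bit 0 of the opcode is set for commands that send data
 * to the device and clear for commands that return data to the host. */
static enum sketch_dma_dir sketch_dir_from_opcode(uint8_t opcode)
{
    return (opcode & 1) ? SKETCH_DMA_TO_DEVICE : SKETCH_DMA_FROM_DEVICE;
}

struct sketch_mapping {
    enum sketch_dma_dir dir;
    int mapped;
};

/* Record the direction at map time... */
static void sketch_map(struct sketch_mapping *m, uint8_t opcode)
{
    m->dir = sketch_dir_from_opcode(opcode);
    m->mapped = 1;
}

/* ...and insist on the same direction at unmap time. */
static void sketch_unmap(struct sketch_mapping *m, uint8_t opcode)
{
    assert(m->mapped);
    assert(m->dir == sketch_dir_from_opcode(opcode));
    m->mapped = 0;
}
```

Computing the direction from the opcode in one place makes it hard for the map and unmap calls to disagree, which is the mismatch this commit removes.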

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:54:05 -05:00
Matthew Wilcox ff976d724a NVMe: Rename IO_TIMEOUT to NVME_IO_TIMEOUT
IO_TIMEOUT is a little too generic and might be used by other parts of
the kernel in the future.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:53:54 -05:00
Matthew Wilcox eca18b2394 NVMe: Merge the nvme_bio and nvme_prp data structures
The new merged data structure is called nvme_iod.  This improves performance
for mid-sized I/Os (in the 16k range) since we save a memory allocation.
It is also a slightly simpler interface to use.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:51:20 -05:00
Matthew Wilcox 5c1281a3bf NVMe: Change nvme_completion_fn to take a dev
The queue is only needed for some rare occasions, and it's more consistent
to pass the device around.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:51:00 -05:00
Matthew Wilcox 040a93b52a NVMe: Change get_nvmeq to take a dev instead of a namespace
Upcoming patches require calling get_nvmeq when we don't have a namespace.
Some callers already have the device in a local variable anyway.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:49:18 -05:00
Matthew Wilcox c2f5b65020 NVMe: Simplify completion handling
Instead of encoding the handler type in the bottom two bits of the
per-completion context pointer, store the handler function as well
as the context pointer.  This gives us more flexibility and the code
is clearer.  It comes at the cost of an extra 8k of memory per queue,
but this feels like a reasonable price to pay.
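
The "store the handler function as well as the context pointer" approach can be sketched as a per-queue table indexed by command id. The type and function names below are hypothetical, and the table here is a plain array rather than the driver's actual layout:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_completion_fn)(void *ctx, int status);

/* One slot per outstanding command: handler and context stored side by
 * side, instead of a type tag packed into the low bits of the pointer. */
struct sketch_cmd_info {
    sketch_completion_fn fn;
    void *ctx;
};

/* Register a handler and its context for the given command id. */
static void sketch_set_info(struct sketch_cmd_info *table, size_t depth,
                            size_t cmdid, sketch_completion_fn fn, void *ctx)
{
    assert(cmdid < depth);
    table[cmdid].fn = fn;
    table[cmdid].ctx = ctx;
}

/* Run the handler registered for cmdid, if any, and clear the slot so a
 * duplicate completion for the same id is ignored. */
static void sketch_complete(struct sketch_cmd_info *table, size_t depth,
                            size_t cmdid, int status)
{
    assert(cmdid < depth);
    struct sketch_cmd_info info = table[cmdid];
    table[cmdid].fn = NULL;
    table[cmdid].ctx = NULL;
    if (info.fn)
        info.fn(info.ctx, status);
}

/* Example handler: store the status in an int supplied as context. */
static void sketch_record_status(void *ctx, int status)
{
    *(int *)ctx = status;
}
```

Each slot grows by one function pointer compared with a tagged context pointer, which is the kind of memory-for-clarity trade the message describes.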

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2012-01-10 14:47:46 -05:00
Linus Torvalds 90160371b3 Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits)
  xen/pciback: Expand the warning message to include domain id.
  xen/pciback: Fix "device has been assigned to X domain!" warning
  xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind"
  xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
  xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
  xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
  Xen: consolidate and simplify struct xenbus_driver instantiation
  xen-gntalloc: introduce missing kfree
  xen/xenbus: Fix compile error - missing header for xen_initial_domain()
  xen/netback: Enable netback on HVM guests
  xen/grant-table: Support mappings required by blkback
  xenbus: Use grant-table wrapper functions
  xenbus: Support HVM backends
  xen/xenbus-frontend: Fix compile error with randconfig
  xen/xenbus-frontend: Make error message more clear
  xen/privcmd: Remove unused support for arch specific privcmp mmap
  xen: Add xenbus_backend device
  xen: Add xenbus device driver
  xen: Add privcmd device driver
  xen/gntalloc: fix reference counts on multi-page mappings
  ...
2012-01-10 10:09:59 -08:00
Linus Torvalds 98793265b4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
  Kconfig: acpi: Fix typo in comment.
  misc latin1 to utf8 conversions
  devres: Fix a typo in devm_kfree comment
  btrfs: free-space-cache.c: remove extra semicolon.
  fat: Spelling s/obsolate/obsolete/g
  SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
  tools/power turbostat: update fields in manpage
  mac80211: drop spelling fix
  types.h: fix comment spelling for 'architectures'
  typo fixes: aera -> area, exntension -> extension
  devices.txt: Fix typo of 'VMware'.
  sis900: Fix enum typo 'sis900_rx_bufer_status'
  decompress_bunzip2: remove invalid vi modeline
  treewide: Fix comment and string typo 'bufer'
  hyper-v: Update MAINTAINERS
  treewide: Fix typos in various parts of the kernel, and fix some comments.
  clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
  gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
  leds: Kconfig: Fix typo 'D2NET_V2'
  sound: Kconfig: drop unknown symbol ARCH_CLPS7500
  ...

Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
kconfig additions, close to removed commented-out old ones)
2012-01-08 13:21:22 -08:00
Linus Torvalds 972b2c7199 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
  reiserfs: Properly display mount options in /proc/mounts
  vfs: prevent remount read-only if pending removes
  vfs: count unlinked inodes
  vfs: protect remounting superblock read-only
  vfs: keep list of mounts for each superblock
  vfs: switch ->show_options() to struct dentry *
  vfs: switch ->show_path() to struct dentry *
  vfs: switch ->show_devname() to struct dentry *
  vfs: switch ->show_stats to struct dentry *
  switch security_path_chmod() to struct path *
  vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
  vfs: trim includes a bit
  switch mnt_namespace ->root to struct mount
  vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
  vfs: opencode mntget() mnt_set_mountpoint()
  vfs: spread struct mount - remaining argument of next_mnt()
  vfs: move fsnotify junk to struct mount
  vfs: move mnt_devname
  vfs: move mnt_list to struct mount
  vfs: switch pnode.h macros to struct mount *
  ...
2012-01-08 12:19:57 -08:00
Al Viro ece2ccb668 Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
Linus Torvalds 356b95424c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: (21 commits)
  m68k/mac: Make CONFIG_HEARTBEAT unavailable on Mac
  m68k/serial: Remove references to obsolete serial config options
  m68k/net: Remove obsolete IRQ_FLG_* users
  m68k: Don't comment out syscalls used by glibc
  m68k/atari: Move declaration of atari_SCC_reset_done to header file
  m68k/serial: Remove references to obsolete CONFIG_SERIAL167
  m68k/hp300: Export hp300_ledstate
  m68k: Initconst section fixes
  m68k/mac: cleanup macro case
  mac_scsi: fix mac_scsi on some powerbooks
  m68k/mac: fix powerbook 150 adb_type
  m68k/mac: fix baboon irq disable and shutdown
  m68k/mac: oss irq fixes
  m68k/mac: fix nubus slot irq disable and shutdown
  m68k/mac: enable via_alt_mapping on performa 580
  m68k/mac: cleanup forward declarations
  m68k/mac: cleanup mac_irq_pending
  m68k/mac: cleanup mac_clear_irq
  m68k/mac: early console
  m68k/mvme16x: Add support for EARLY_PRINTK
  ...

Fix up trivial conflict in arch/m68k/Kconfig.debug due to new
EARLY_PRINTK config option addition clashing with movement of the
BOOTPARAM options.
2012-01-06 18:28:12 -08:00
Michal Simek ba2d5affde block: xsysace: Don't use NO_IRQ
Drivers shouldn't use NO_IRQ. Microblaze and PPC
define NO_IRQ as 0 and this reference will be removed
in the near future.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Reviewed-by: Ryan Mallon <rmallon@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
CC: Rob Herring <rob.herring@calxeda.com>
2012-01-05 08:34:29 +01:00
Jan Beulich 73db144b58 Xen: consolidate and simplify struct xenbus_driver instantiation
The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.

Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).

Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.

v2: Restore DRV_NAME for the driver name in xen-pciback.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-01-04 17:01:17 -05:00
Asai Thambi S P 62ee8c13e2 mtip32xx: do rebuild monitoring asynchronously
Earlier, rebuild monitoring was done in the context of probe. Now the service
thread takes the responsibility of rebuild monitoring, and probe returns good
status.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-01-04 22:01:32 +01:00
Al Viro 2c9ede55ec switch device_get_devnode() and ->devnode() to umode_t *
both callers of device_get_devnode() are only interested in lower 16bits
and nobody tries to return anything wider than 16bit anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:55 -05:00
Al Viro ff01bb4832 fs: move code out of buffer.c
Move invalidate_bdev, block_sync_page into fs/block_dev.c.  Export
kill_bdev as well, so brd doesn't have to open code it.  Reduce
buffer_head.h requirement accordingly.

Removed a rather large comment from invalidate_bdev, as it looked a bit
obsolete to bother moving.  The small comment replacing it says enough.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:07 -05:00
Jens Axboe f748040bb8 Merge branch 'stable/for-jens-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.3/drivers 2011-12-25 16:46:46 +01:00
Linus Torvalds b0d78ee89c Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: don't kick empty queue in blk_drain_queue()
  block/swim3: Locking fixes
  loop: Fix discard_alignment default setting
  cfq-iosched: fix cfq_cic_link() race condition
  cfq-iosched: free cic_index if blkio_alloc_blkg_stats fails
  cciss: fix flush cache transfer length
  cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler
  loop: fix loop block driver discard and encryption comment
  block: initialize request_queue's numa node during
2011-12-16 10:05:14 -08:00
Thomas Meyer f094148a17 xen-blkfront: Use kcalloc instead of kzalloc to allocate array
The advantage of kcalloc is that it prevents integer overflows which could
result from multiplying the number of elements by the element size, and it is
also a bit nicer to read.
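
As a userspace illustration of the overflow check described above (a sketch, not the kernel's implementation), the helper below refuses an `n * size` request whose product would not fit in `size_t`. The name `checked_zalloc_array` is made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a zeroed array of n elements of 'size' bytes each.
 * Returns NULL if n * size would overflow size_t or if malloc fails. */
static void *checked_zalloc_array(size_t n, size_t size)
{
    size_t bytes;
    void *p;

    if (size != 0 && n > SIZE_MAX / size)
        return NULL;            /* the multiplication would wrap around */

    bytes = n * size;
    p = malloc(bytes ? bytes : 1);
    if (p)
        memset(p, 0, bytes);
    return p;
}

/* Usage: a wrapping request fails cleanly, a sane one is zero-filled. */
static void checked_zalloc_array_demo(void)
{
    int *a = checked_zalloc_array(4, sizeof(int));

    assert(checked_zalloc_array(SIZE_MAX, 2) == NULL);
    assert(a != NULL && a[0] == 0 && a[3] == 0);
    free(a);
}
```

An open-coded `malloc(n * size)` would silently allocate a too-small buffer in the first case, which is the class of bug the change guards against.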

The semantic patch that makes this change is available
in https://lkml.org/lkml/2011/11/25/107

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
[v1: Separated the drivers/block/cciss_scsi.c out of this patch]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-12-16 12:36:52 -05:00
Tejun Heo 1ba64edef6 block, sx8: kill blk_insert_request()
The only user left for blk_insert_request() is sx8 and it can be
trivially switched to use blk_execute_rq_nowait() - special requests
aren't included in io stat and sx8 doesn't use block layer tagging.
Switch sx8 and kill blk_insert_request().

This patch doesn't introduce any functional difference.

Only compile tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-12-14 00:33:37 +01:00
Linus Torvalds 653f42f6b6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: add missing spin_unlock at ceph_mdsc_build_path()
  ceph: fix SEEK_CUR, SEEK_SET regression
  crush: fix mapping calculation when force argument doesn't exist
  ceph: use i_ceph_lock instead of i_lock
  rbd: remove buggy rollback functionality
  rbd: return an error when an invalid header is read
  ceph: fix rasize reporting by ceph_show_options
2011-12-13 14:59:42 -08:00
Benjamin Herrenschmidt b302545744 block/swim3: Locking fixes
The old PowerMac swim3 driver has some "interesting" locking issues,
using a private lock and failing to lock the queue before completing
requests, which triggered WARN_ONs among others.

This rips out the private lock, makes everything operate under the
block queue lock, and generally makes things simpler.

We used to also share a queue between the two possible instances which
was problematic since we might pick the wrong controller in some cases,
so make the queue and the current request per-instance and use
queuedata to point to our private data which is a lot cleaner.

We still share the queue lock but then, it's nearly impossible to actually
use 2 swim3's simultaneously: one would need to have a Wallstreet
PowerBook, the only machine afaik with two of these on the motherboard,
and populate both hotswap bays with a floppy drive (the machine ships
only with one), so nobody cares...

While at it, add a little fix to clear up stale interrupts when loading
the driver or plugging a floppy drive in a bay.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-12-12 12:42:12 +01:00
Finn Thain ed04c97d51 m68k/mac: cleanup forward declarations
Move some forward declarations into header files and adjust includes.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2011-12-10 19:52:46 +01:00
Josh Durgin 51703306b3 rbd: remove buggy rollback functionality
This doesn't interact with resizing well, since it doesn't set the
size of the device to the size at the snapshot. It's also an expensive
operation to be synchronous. Rollback can still be done with the
userspace rbd tool.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-07 10:46:19 -08:00
Josh Durgin 81e759fbf7 rbd: return an error when an invalid header is read
This protects against opening future rbd images that have incompatible format changes.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-07 10:46:10 -08:00
Justin P. Mattock 42b2aa86c6 treewide: Fix typos in various parts of the kernel, and fix some comments.
The below patch fixes some typos in various parts of the kernel, as well as fixes some comments.
Please let me know if I missed anything, and I will try to get it changed and resent.

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-12-02 14:57:31 +01:00
Lukas Czerner dfaf3c036c loop: Fix discard_alignment default setting
discard_alignment is not relevant to the loop driver since it is
supposed to be set as a workaround for the old sector 63 alignments.
So set it to zero rather than block size.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-12-02 14:47:03 +01:00
Stephen M. Cameron 59bd71a81b cciss: fix flush cache transfer length
We weren't filling in the transfer length of the
flush cache command (it transfers 4 bytes of zeroes).
Firmware didn't seem to be bothered by this, but it
should be fixed.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-28 20:12:05 +01:00
Stephen M. Cameron 6225da4815 cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler
IRQF_SHARED is required for older controllers that don't support MSI(X)
and which may end up sharing an interrupt.

Also remove deprecated IRQF_DISABLED.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-28 20:12:05 +01:00
Dave Young ae95757a90 loop: fix loop block driver discard and encryption comment
The loop driver does not support discard if encryption is enabled,
fix the comment.

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-25 09:41:25 +01:00
Dan Carpenter 3e54a3d1b8 mtip32xx: uninitialized variable in mtip_quiesce_io()
We recently introduced a new continue in the loop, which makes gcc complain.
In theory if MTIP_FLAG_SVC_THD_ACTIVE_BIT is set, we could hit continue
over and over until eventually we time out of the loop.  In that case
"active" should be set as true, but right now it's uninitialized.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-24 12:59:00 +01:00
Asai Thambi S P 60ec0eecfa mtip32xx: updates based on feedback
* queue ncq commands when a non-ncq is in progress or error handling is active
* merge variables 'internal_cmd_in_progress' and 'eh_active' into new variable 'flags'
* get rid of read/write semaphore 'internal_sem'
* new service thread to issue queued commands
* use macros from ata.h for command codes
* return ENOTTY for BLKFLSBUF ioctl
* style changes

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-23 08:29:24 +01:00
Li Dongyang ae18be11b5 xen-blkback: convert hole punching to discard request on loop devices
As of dfaa2ef68e, loop devices support discard requests. We can just
issue a discard request and the loop driver will punch the hole for us,
so we don't need to touch the internals of the loop device and punch the
hole ourselves.

V0->V1: rebased on devel/for-jens-3.3

Signed-off-by: Li Dongyang <lidongyang@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-18 13:28:05 -05:00
Konrad Rzeszutek Wilk 421463526f xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io
.. and move it to its own function that will deal with the
discard operation.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-18 13:28:03 -05:00
Konrad Rzeszutek Wilk 5ea4298669 xen/blk[front|back]: Enhance discard support with secure erasing support.
Part of the blkdev_issue_discard(xx) operation is that it can also
issue a secure discard operation that will permanently remove the
sectors in question. We advertise that we can support that via the
'discard-secure' attribute and on the request, if the 'secure' bit
is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE.

CC: Li Dongyang <lidongyang@novell.com>
[v1: Used 'flag' instead of 'secure:1' bit]
[v2: Use 'reserved' uint8_t instead of adding a new value]
[v3: Check for nseg when mapping instead of operation]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-18 13:28:01 -05:00
Konrad Rzeszutek Wilk 97e36834f5 xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard together
In a union type structure to deal with the overlapping
attributes in an easier manner.

Suggested-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-18 13:27:59 -05:00
Dan Carpenter a2c2a0e668 paride: fix potential information leak in pg_read()
Smatch has a new check for Rosenberg type information leaks where structs
are copied to the user with uninitialized stack data in them.  In this
case, the pg_write_hdr struct has a hole in it.

struct pg_write_hdr {
        char                       magic;                /*     0     1 */
        char                       func;                 /*     1     1 */
        /* XXX 2 bytes hole, try to pack */
        int                        dlen;                 /*     4     4 */
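
A minimal sketch of the usual remedy for this kind of leak, assuming a struct with the same leading layout as the one quoted above: zero the whole object, padding included, before filling in the fields. The names `hdr_example` and `fill_hdr` are hypothetical, and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same leading members as the quoted layout. On ABIs where int is
 * 4-byte aligned, there is a 2-byte hole between 'func' and 'dlen'. */
struct hdr_example {
    char magic;
    char func;
    int dlen;
};

/* Clear every byte of the struct first, so the hole does not carry
 * whatever happened to be on the stack, then set the real fields. */
static void fill_hdr(struct hdr_example *h, char magic, char func, int dlen)
{
    memset(h, 0, sizeof(*h));
    h->magic = magic;
    h->func = func;
    h->dlen = dlen;
}

/* Usage: the fields are set and the hole is visible via offsetof. */
static void fill_hdr_demo(void)
{
    struct hdr_example h;

    fill_hdr(&h, 'P', 1, 42);
    assert(h.magic == 'P' && h.func == 1 && h.dlen == 42);
    assert(offsetof(struct hdr_example, dlen) >= 2);
}
```

Assigning the members one by one without the `memset` would leave the hole untouched, and a byte-wise copy of the struct would then expose it.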

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Tim Waugh <tim@cyberelk.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-16 09:21:50 +01:00
Stephen M. Cameron 0007a4c90a cciss: auto engage SCSI mid layer at driver load time
A long time ago, probably in 2002, one of the distros, or maybe more than
one, loaded block drivers prior to loading the SCSI mid layer.  This meant
that the cciss driver, being a block driver, could not engage the SCSI mid
layer at init time without panicking, and relied on being poked by a
userland program after the system was up (and the SCSI mid layer was
therefore present) to engage the SCSI mid layer.

This is no longer the case, and cciss can safely rely on the SCSI mid
layer being present at init time and engage the SCSI mid layer straight
away.  This means that users will see their tape drives and medium
changers at driver load time without need for a script in /etc/rc.d that
does this:

for x in /proc/driver/cciss/cciss*
do
	echo "engage scsi" > $x
done

However, if no tape drives or medium changers are detected, the SCSI mid
layer will not be engaged.  If a tape drive or medium changer is later
hot-added to the system it will then be necessary to use the above script
or similar for the device(s) to be accessible.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-16 09:21:49 +01:00
Dmitry Monakhov 7035b5df3c loop: cleanup set_status interface
1) Anyone who has read access to loopdev has permission to call set_status
   and may change important parameters such as lo_offset, lo_sizelimit and
   so on, which contradicts the read access pattern and is effectively
   write access.
2) Add lo_offset over i_size check to prevent blkdev_size overflow.
   ##Testcase_begin
   #dd if=/dev/zero of=./file bs=1k count=1
   #losetup /dev/loop0 ./file
   /* userspace_application */
   struct loop_info64 loinf;
   fd = open("/dev/loop0", O_RDONLY);
   ioctl(fd, LOOP_GET_STATUS64, &loinf);
   /* Set offset to any value which is bigger than i_size, and sizelimit
    * to nonzero value*/
   loinf.lo_offset = 4096*1024;
   loinf.lo_sizelimit = 1024;
   ioctl(fd, LOOP_SET_STATUS64, &loinf);
   /* After this loop device will have size similar to 0x7fffffffffxxxx */
   #blockdev --getsz /dev/loop0
   ##OUTPUT: 36028797018955968
   ##Testcase_end
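
The overflow in item 2 can be illustrated with a small standalone sketch (an assumption about the shape of the check, not the code from the patch). The hypothetical `loop_size_sectors` computes the device size in 512-byte sectors and returns 0 when the offset lies beyond the end of the backing file, instead of letting a negative byte count be reinterpreted as a huge unsigned size.

```c
#include <assert.h>
#include <stdint.h>

/* Size of a loop-style device in 512-byte sectors, given the backing file
 * size, the start offset and an optional size limit (0 means no limit). */
static int64_t loop_size_sectors(int64_t file_size, int64_t offset,
                                 int64_t sizelimit)
{
    int64_t size = file_size;

    if (offset > 0)
        size -= offset;
    if (size < 0)
        return 0;               /* offset is beyond the end of the file */
    if (sizelimit > 0 && sizelimit < size)
        size = sizelimit;
    return size >> 9;           /* bytes to 512-byte sectors */
}

/* Usage: the parameters from the test case above now yield size 0. */
static void loop_size_sectors_demo(void)
{
    assert(loop_size_sectors(1024, 4096 * 1024, 1024) == 0);
    assert(loop_size_sectors(1024 * 1024, 4096, 0) == 2040);
}
```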

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-16 09:21:49 +01:00
Dmitry Monakhov 3bb9068278 loop: prevent information leak after failed read
If read was not fully successful we have to fail whole bio to prevent
information leak of old pages

##Testcase_begin
dd if=/dev/zero of=./file bs=1M count=1
losetup /dev/loop0 ./file -o 4096
truncate -s 0 ./file
# Oops, the loop offset is now beyond i_size, so the read will silently fail.
# The bio's pages would then not be cleared, which may result in an information leak.
hexdump -C /dev/loop0
##testcase_end

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-16 09:21:48 +01:00
Matthew Garrett 1937335856 The Windows driver .inf disables ASPM on all cciss devices. Do the same.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: iss_storagedev@hp.com
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-11 22:05:54 +01:00
Linus Torvalds 32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Linus Torvalds 06d381484f Merge branch 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  net: xen-netback: use API provided by xenbus module to map rings
  block: xen-blkback: use API provided by xenbus module to map rings
  xen: use generic functions instead of xen_{alloc, free}_vm_area()
2011-11-06 18:31:36 -08:00
Jens Axboe a71f483d79 mtip32xx: update to new ->make_request() API
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-05 08:36:21 +01:00
Jens Axboe 0e838c624e mtip32xx: add module.h include to avoid conflict with moduleh tree
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-05 08:35:10 +01:00
Jens Axboe 3ff147d3a8 mtip32xx: mark a few more items static
Missed two items: mtip_major, and mtip_pci_driver.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-05 08:35:10 +01:00
Jens Axboe 6316668fbc mtip32xx: ensure that all local functions are static
Kill the declarations in the header file and mark them as static.
Reshuffle a few functions to ensure that everything is properly
declared before being used.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-05 08:35:10 +01:00
Jens Axboe ef0f158734 mtip32xx: cleanup compat ioctl handling
Do the conversion/copy up front instead of passing in a compat flag
to the ioctl handler and subsequently to the exec_drive_taskfile()
function.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-05 08:35:10 +01:00
Jens Axboe 16d02c040b mtip32xx: fix warnings/errors on 32-bit compiles
We need to clean up the compat ioctl handling, but this makes it
work for now at least.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-11-05 08:35:10 +01:00
Sam Bradshaw 88523a6155 block: Add driver for Micron RealSSD pcie flash cards
This adds mtip32xx, a driver supporting Micron's line of
pci-express flash storage cards.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-11-05 08:35:10 +01:00
Linus Torvalds 3d0a8d10cf Merge branch 'for-3.2/drivers' of git://git.kernel.dk/linux-block
* 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits)
  virtio-blk: use ida to allocate disk index
  hpsa: add small delay when using PCI Power Management to reset for kdump
  cciss: add small delay when using PCI Power Management to reset for kdump
  xen/blkback: Fix two races in the handling of barrier requests.
  xen/blkback: Check for proper operation.
  xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
  xen/blkback: Report VBD_WSECT (wr_sect) properly.
  xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
  xen-blkfront: plug device number leak in xlblk_init() error path
  xen-blkfront: If no barrier or flush is supported, use invalid operation.
  xen-blkback: use kzalloc() in favor of kmalloc()+memset()
  xen-blkback: fixed indentation and comments
  xen-blkfront: fix a deadlock while handling discard response
  xen-blkfront: Handle discard requests.
  xen-blkback: Implement discard requests ('feature-discard')
  xen-blkfront: add BLKIF_OP_DISCARD and discard request struct
  drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd()
  drivers/block/loop.c: emit uevent on auto release
  drivers/block/cpqarray.c: use pci_dev->revision
  loop: always allow userspace partitions and optionally support automatic scanning
  ...

Fix up trivial header file inclusion conflict in drivers/block/loop.c
2011-11-04 17:22:14 -07:00
Linus Torvalds b4fdcb02f1 Merge branch 'for-3.2/core' of git://git.kernel.dk/linux-block
* 'for-3.2/core' of git://git.kernel.dk/linux-block: (29 commits)
  block: don't call blk_drain_queue() if elevator is not up
  blk-throttle: use queue_is_locked() instead of lockdep_is_held()
  blk-throttle: Take blkcg->lock while traversing blkcg->policy_list
  blk-throttle: Free up policy node associated with deleted rule
  block: warn if tag is greater than real_max_depth.
  block: make gendisk hold a reference to its queue
  blk-flush: move the queue kick into
  blk-flush: fix invalid BUG_ON in blk_insert_flush
  block: Remove the control of complete cpu from bio.
  block: fix a typo in the blk-cgroup.h file
  block: initialize the bounce pool if high memory may be added later
  block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown
  block: drop @tsk from attempt_plug_merge() and explain sync rules
  block: make get_request[_wait]() fail if queue is dead
  block: reorganize throtl_get_tg() and blk_throtl_bio()
  block: reorganize queue draining
  block: drop unnecessary blk_get/put_queue() in scsi_cmd_ioctl() and blk_get_tg()
  block: pass around REQ_* flags instead of broken down booleans during request alloc/free
  block: move blk_throtl prototypes to block/blk.h
  block: fix genhd refcounting in blkio_policy_parse_and_set()
  ...

Fix up trivial conflicts due to "mddev_t" -> "struct mddev" conversion
and making the request functions be of type "void" instead of "int" in
 - drivers/md/{faulty.c,linear.c,md.c,md.h,multipath.c,raid0.c,raid1.c,raid10.c,raid5.c}
 - drivers/staging/zram/zram_drv.c
2011-11-04 17:06:58 -07:00
Matthew Wilcox f1938f6e1e NVMe: Implement doorbell stride capability
The doorbell stride allows devices to spread out their doorbells instead
of packing them tightly.  This feature was added as part of ECN 003.

This patch also enables support for more than 512 queues :-)
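
As a sketch of what the stride means in practice, the doorbell register layout in the NVMe specification places the doorbells from offset 0x1000, with consecutive doorbells `4 << DSTRD` bytes apart, where DSTRD is the stride field from the controller capabilities. The helper name `doorbell_offset` is hypothetical and this is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset (from the start of the register BAR) of a queue's doorbell.
 * qid:   queue identifier
 * is_cq: 0 for the submission queue tail doorbell,
 *        1 for the completion queue head doorbell
 * dstrd: doorbell stride field from the controller capabilities */
static uint32_t doorbell_offset(uint32_t qid, int is_cq, uint32_t dstrd)
{
    uint32_t stride = 4u << dstrd;  /* dstrd == 0 means tightly packed */

    return 0x1000u + (2u * qid + (is_cq ? 1u : 0u)) * stride;
}

/* Usage: with stride 0 the doorbells are 4 bytes apart, with stride 1
 * they are 8 bytes apart. */
static void doorbell_offset_demo(void)
{
    assert(doorbell_offset(0, 0, 0) == 0x1000);
    assert(doorbell_offset(1, 1, 0) == 0x100C);
    assert(doorbell_offset(1, 0, 1) == 0x1010);
}
```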

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:05 -04:00
Matthew Wilcox ce38c14957 NVMe: Version 0.7
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:05 -04:00
Matthew Wilcox 2b2c189687 NVMe: Don't probe namespace 0
ECN 001 documented that namespace 0 is not valid.  Sending an Identify
with CNS of 0 and Namespace of 0 is an undefined command.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:04 -04:00
Nisheeth Bhat 0d1bc91258 Fix calculation of number of pages in a PRP List
The existing calculation underestimated the number of pages required
as it did not take into account the pointer at the end of each page.
The replacement calculation may overestimate the number of pages required
if the last page in the PRP List is entirely full.  By using ->npages
as a counter as we fill in the pages, we ensure that we don't try to
free a page that was never allocated.
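
The counting argument can be sketched as follows, assuming 4096-byte pages and 8-byte PRP entries, with the last slot of each non-final list page used as the pointer to the next list page. The function names are hypothetical; this illustrates the arithmetic rather than reproducing the driver.

```c
#include <assert.h>

#define PAGE_SZ        4096u
#define PRP_ENTRY_SZ   8u
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Underestimate: treats every slot in a list page as a data entry and
 * forgets the chain pointer at the end of each page. */
static unsigned list_pages_naive(unsigned nprps)
{
    return DIV_ROUND_UP(PRP_ENTRY_SZ * nprps, PAGE_SZ);
}

/* Conservative: assumes every list page gives up one slot to the chain
 * pointer. This can overestimate by one page when the final list page
 * is completely full, because that page needs no chain pointer. */
static unsigned list_pages_safe(unsigned nprps)
{
    return DIV_ROUND_UP(PRP_ENTRY_SZ * nprps, PAGE_SZ - PRP_ENTRY_SZ);
}

/* Usage: 1024 entries need three pages (511 + 511 + 2), which the naive
 * formula misses; 512 entries fit in one page, which the safe formula
 * rounds up to two. */
static void list_pages_demo(void)
{
    assert(list_pages_naive(1024) == 2);
    assert(list_pages_safe(1024) == 3);
    assert(list_pages_safe(512) == 2);
}
```

Because the safe estimate can exceed what is actually used, tracking the number of pages really allocated, as the message describes, keeps the free path from touching a page that never existed.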

Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:04 -04:00
Matthew Wilcox bc5fc7e4b2 NVMe: Create nvme_identify and nvme_get_features functions
Instead of open-coding calls to nvme_submit_admin_cmd, these
small wrappers are simpler to use (the patch removes 14 lines from
nvme_dev_add() for example).

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:04 -04:00
Matthew Wilcox 684f5c2025 NVMe: Fix memory leak in nvme_dev_add()
The driver was allocating 8k of memory, then freeing 4k of it.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:04 -04:00
Nisheeth Bhat d1a490e026 NVMe: Fix calls to dma_unmap_sg
dma_unmap_sg() must be called with the same 'nents' passed to
dma_map_sg(), not the number returned from dma_map_sg().

Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:04 -04:00
Matthew Wilcox d0ba1e497b NVMe: Correct sg list setup in nvme_map_user_pages
Our SG list was constructed to always fill the entire first page, even
if that was more than the length of the I/O.  This is probably harmless,
but some IOMMUs might do something bad.

Correcting the first call to sg_set_page() made it look a lot closer to
the sg_set_page() in the loop, so fold the first call to sg_set_page()
into the loop.
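
A standalone sketch of the corrected length calculation, assuming 4096-byte pages: every segment, including the first, is capped at the remaining I/O length, so one loop handles all pages. `split_into_pages` is a hypothetical helper, not the driver function.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u

/* Split an I/O of 'length' bytes starting 'offset' bytes into its first
 * page into per-page segment lengths. Returns the number of segments
 * written to seg_len (at most max_segs). */
static size_t split_into_pages(size_t offset, size_t length,
                               size_t *seg_len, size_t max_segs)
{
    size_t n = 0;

    assert(offset < PAGE_SZ);
    while (length > 0 && n < max_segs) {
        size_t chunk = PAGE_SZ - offset;    /* room left in this page */

        if (chunk > length)
            chunk = length;                 /* never exceed the I/O length */
        seg_len[n++] = chunk;
        length -= chunk;
        offset = 0;                         /* later pages start at 0 */
    }
    return n;
}

/* Usage: a 200-byte I/O at offset 100 is a single 200-byte segment,
 * not a segment covering the rest of the first page. */
static void split_into_pages_demo(void)
{
    size_t segs[4];

    assert(split_into_pages(100, 200, segs, 4) == 1);
    assert(segs[0] == 200);
}
```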

Reported-by: Nisheeth Bhat <nisheeth.bhat@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2011-11-04 15:53:04 -04:00
Matthew Wilcox 6413214c5d Fix bug in NVME_IOCTL_SUBMIT_IO
Missing 'break' in the switch statement meant that we'd fall through
to the 'return -EINVAL' case.
2011-11-04 15:53:04 -04:00
Matthew Wilcox 6bbf1acdde NVMe: Rework ioctls
Remove the special-purpose IDENTIFY, GET_RANGE_TYPE, DOWNLOAD_FIRMWARE
and ACTIVATE_FIRMWARE commands.  Replace them with a generic ADMIN_CMD
ioctl that can submit any admin command.

Add a new ID ioctl that returns the namespace ID of the queried device.
It corresponds to the SCSI Idlun ioctl.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:03 -04:00
Matthew Wilcox eac623ba7a NVMe: Add the nvme thread to the wait queue before waking it up
If the I/O was not completed by a single NVMe command, we add the
bio to the congestion list and wake up the kthread to resubmit it.
But the kthread calls remove_wait_queue() unconditionally, which
will oops if it's not on the wait queue.  So add the kthread to
the wait queue before waking it up.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:03 -04:00
Matthew Wilcox 6f0f54499f NVMe: Return real error from nvme_create_queue
nvme_setup_io_queues() was assuming that a NULL return from
nvme_create_queue() was an out-of-memory error.  That's not necessarily
true; the adapter might return -EIO, for example.  Change the calling
convention to return an ERR_PTR on failure instead of NULL.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:03 -04:00
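
Illustration (not part of the commit): the ERR_PTR calling convention described in 6f0f54499f can be sketched in userspace as below. The three helpers imitate the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); demo_queue, demo_create_queue(), demo_setup_queue() and the fail_with knob are hypothetical names invented for this sketch, not the driver's code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Userspace imitation of the kernel helpers: the top 4095 addresses are
 * never valid pointers, so they can carry a negative errno value.
 */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct demo_queue { int qid; };	/* hypothetical stand-in for a queue */

/* 'fail_with' is a hypothetical knob that simulates a device failure. */
static struct demo_queue *demo_create_queue(int qid, int fail_with)
{
	struct demo_queue *q;

	if (fail_with)
		return ERR_PTR(-fail_with);	/* e.g. the adapter said -EIO */
	q = malloc(sizeof(*q));
	if (!q)
		return ERR_PTR(-ENOMEM);
	q->qid = qid;
	return q;
}

/* The caller propagates the real error instead of assuming -ENOMEM. */
static int demo_setup_queue(int qid, int fail_with)
{
	struct demo_queue *q = demo_create_queue(qid, fail_with);

	if (IS_ERR(q))
		return (int)PTR_ERR(q);
	free(q);
	return 0;
}
```

With a NULL return the caller can only guess at the cause; with an encoded pointer the same return value carries either the object or the reason it could not be created.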
Matthew Wilcox be5e094840 NVMe: Version 0.6
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:03 -04:00
Matthew Wilcox 184d2944cb NVMe: Add a few calling convention notes
For the benefit of reviewers, add comments to a few functions describing
their calling context

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:03 -04:00
Matthew Wilcox b77954cbdd NVMe: Handle failures from memory allocations in nvme_setup_prps
If any of the memory allocations in nvme_setup_prps fail, handle it by
modifying the passed-in data length to reflect the number of bytes we are
actually able to send.  Also allow the caller to specify the GFP flags
they need; for user-initiated commands, we can use GFP_KERNEL allocations.

The various callers are updated to handle this possibility; the main
I/O path is already prepared for this possibility (as it may happen
due to nvme_map_bio being unable to map all the segments of the I/O).
The other callers return -ENOMEM instead of doing partial I/Os.

Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:03 -04:00
Matthew Wilcox 5aff9382dd NVMe: Use an IDA to allocate minor numbers
The current approach of using the namespace ID as the minor number
doesn't work when there are multiple adapters in the machine.  Rather
than statically partitioning the number of namespaces between adapters,
dynamically allocate minor numbers to namespaces as they are detected.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:03 -04:00
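
Illustration (not part of the commit): 5aff9382dd hands out minor numbers dynamically rather than deriving them from the namespace ID. The kernel's IDA does this at scale; the userspace sketch below only shows the idea with a single 64-bit bitmap. DEMO_MAX_MINORS, demo_get_minor() and demo_put_minor() are hypothetical names and the 64-entry limit is an assumption made for brevity.

```c
#include <stdint.h>

#define DEMO_MAX_MINORS 64	/* hypothetical limit: one bit per minor */

static uint64_t demo_minor_map;	/* bit n set => minor n is in use */

/* Return the lowest free minor number, or -1 if all are taken. */
static int demo_get_minor(void)
{
	for (int i = 0; i < DEMO_MAX_MINORS; i++) {
		if (!(demo_minor_map & (UINT64_C(1) << i))) {
			demo_minor_map |= UINT64_C(1) << i;
			return i;
		}
	}
	return -1;
}

/* Release a minor so that a later namespace can reuse it. */
static void demo_put_minor(int minor)
{
	demo_minor_map &= ~(UINT64_C(1) << minor);
}
```

Because numbers are taken from one shared pool as namespaces are detected, two adapters that both expose namespace 1 no longer collide.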
Matthew Wilcox fd63e9ceee NVMe: Add include of delay.h for msleep
Previously it was being implicitly included through some other header file

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:02 -04:00
Matthew Wilcox 8de055350f NVMe: Add support for timing out I/Os
In the kthread, walk the list of outstanding I/Os and check they've not
hit the timeout.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:02 -04:00
Matthew Wilcox 21075bdee0 NVMe: Rename cancel_cmdid_data to cancel_cmdid
The trailing '_data' on the end was annoying and inconsistent.  Also, make
it actually return the data since this is needed for timing out commands.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:02 -04:00
Matthew Wilcox 09a58f5364 NVMe: Fix bug in error handling
When an I/O completed with an error, we would call bio_endio twice
(once with -EIO and once with 0).  Found by inspection.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:02 -04:00
Matthew Wilcox 22605f9681 NVMe: Time out initialisation after a few seconds
The device reports (in its capability register) how long it will take
to initialise.  If that time elapses before the ready bit becomes set,
conclude the device is broken and refuse to initialise it.  Log a nice
error message so the user knows why we did nothing.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:02 -04:00
Matthew Wilcox aba2080f3f NVMe: Fix warning in free_irq
We need to clear the affinity mask before calling free_irq()

Reported-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:02 -04:00
Matthew Wilcox 7f53f9d242 NVMe: Correct the Controller Configuration settings
The arbitration field was extended by one bit, shifting the shutdown
notification bits by one.  Also, the SQ/CQ entry size was made
configurable for future extensions.

Reported-by: Paul Luse <paul.e.luse@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:01 -04:00
Matthew Wilcox 8ef700678f NVMe: Version 0.5
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:01 -04:00
Matthew Wilcox 6c7d49455c NVMe: Change the definition of nvme_user_io
The read and write commands don't define a 'result', so there's no need
to copy it back to userspace.

Remove the ability of the ioctl to submit commands to a different
namespace; it's just asking for trouble, and the use case I have in mind
will be addressed through a different ioctl in the future.  That removes
the need for both the block_shift and nsid arguments.

Check that the opcode is one of 'read' or 'write'.  Further opcodes may
be added in the future, but we will need a different structure definition
for them.

The nblocks field is redefined to be 0-based.  This allows the user to
request the full 65536 blocks.

Don't byteswap the reftag, apptag and appmask.  Martin Petersen tells
me these are calculated in big-endian and are transmitted to the device
in big-endian.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:01 -04:00
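
Illustration (not part of the commit): the 0-based nblocks field from 6c7d49455c, sketched under the assumption that the field is 16 bits wide (which is what the 65536 figure implies). The helper names are hypothetical.

```c
#include <stdint.h>

/* A 0-based count: a stored value of n means n + 1 blocks. */
static uint32_t demo_blocks_from_field(uint16_t nblocks)
{
	return (uint32_t)nblocks + 1;
}

/*
 * Encode a block count into the 16-bit field.  Returns 0 on success or
 * -1 if the count cannot be represented (0 blocks, or more than 65536).
 */
static int demo_field_from_blocks(uint32_t count, uint16_t *field)
{
	if (count == 0 || count > 65536)
		return -1;
	*field = (uint16_t)(count - 1);
	return 0;
}
```

A 1-based 16-bit field tops out at 65535 blocks and wastes the value 0; the 0-based encoding uses every value and reaches 65536.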
Matthew Wilcox 4948168280 NVMe: Add compat_ioctl
Make ioctls work for 32-bit applications on 64-bit kernels.  The structures
are defined to be the same for both 32- and 64-bit applications, so
we can use the same handler for both.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:01 -04:00
Matthew Wilcox 9ecdc94621 NVMe: Simplify queue lookup
Fill in all the num_possible_cpus() entries with duplicate pointers.
This reduces the complexity of the frequently-called get_nvmeq(), as
well as avoiding a bug in it when there are fewer queues than CPUs.

Reported-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:01 -04:00
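
Illustration (not part of the commit): the lookup simplification in 9ecdc94621 amounts to paying for the mapping once at setup time. In the sketch below every possible CPU gets a table entry, and entries repeat when there are fewer queues than CPUs. DEMO_NR_CPUS, the demo_* names and the simple modulo mapping are assumptions; the driver's actual distribution may differ.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 8		/* hypothetical num_possible_cpus() */

struct demo_queue { int qid; };

static struct demo_queue *demo_queue_for_cpu[DEMO_NR_CPUS];

/* Fill every per-CPU slot, wrapping round when queues are scarce. */
static void demo_fill_queue_table(struct demo_queue *queues, size_t nr_queues)
{
	for (size_t cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_queue_for_cpu[cpu] = &queues[cpu % nr_queues];
}

/* The hot path is a plain array index with no range logic. */
static struct demo_queue *demo_get_queue(size_t cpu)
{
	return demo_queue_for_cpu[cpu];
}
```

Since no slot is ever left empty, the frequently-called lookup cannot land on a missing queue, whatever the ratio of queues to CPUs.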
Matthew Wilcox 3cb967c039 NVMe: Remove the kthread from the wait queue
Once there are no more bios on the congestion list, we can stop waking
up the nvme kthread every time a completion happens.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:00 -04:00
Matthew Wilcox 7523d834dd NVMe: Fix off-by-one when filling in PRP lists
If the last element in the PRP list fits on the end of the page, there's
no need to allocate an extra page to put that single element in.  It can
fit on the end of the page.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:00 -04:00
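
Illustration (not part of the commit): the boundary case fixed by 7523d834dd can be seen by counting PRP list pages. The sketch assumes 4096-byte pages and 8-byte PRP entries, with the last slot of every page except the final one used as a pointer to the next list page. demo_prp_list_pages() is a hypothetical helper, not the driver's function.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u
#define DEMO_PRPS_PER_PAGE (DEMO_PAGE_SIZE / 8)	/* 512 eight-byte entries */

/*
 * Pages needed to hold 'nprps' PRP entries.  Every page but the last
 * gives up its final slot to a chain pointer, so a chain of p pages
 * holds 511 * p + 1 entries.
 */
static size_t demo_prp_list_pages(size_t nprps)
{
	size_t usable = DEMO_PRPS_PER_PAGE - 1;	/* 511 */

	if (nprps <= 1)
		return nprps;	/* 0 entries -> 0 pages, 1 entry -> 1 page */
	return (nprps - 1 + usable - 1) / usable;	/* ceil((nprps - 1) / 511) */
}
```

With 512 entries the last one occupies the final slot of the first page, so one page is enough; a second page is only needed from the 513th entry onwards.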
Matthew Wilcox ac88c36a38 NVMe: Fix interpretation of 'Number of Namespaces' field
The spec says this is a 0's based value.  We don't need to handle the
maximal value because it's reserved to mean "every namespace".

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:00 -04:00
Matthew Wilcox 19e899b2f9 NVMe: Remove outdated comments
The head can never overrun the tail since we won't allocate enough command
IDs to let that happen.  The status codes are in sync with the spec.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:00 -04:00
Matthew Wilcox fa92282149 NVMe: Fix comment formatting
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:00 -04:00
Matthew Wilcox 714a7a2288 NVMe: Convert comments to kernel-doc notation
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:53:00 -04:00
Matthew Wilcox b57ab0fada NVMe: Version 0.4
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:59 -04:00
Matthew Wilcox e6d15f79f9 NVMe: Reduce maximum queue depth by 1
The spec says we're not allowed to completely fill the submission queue.
Solve this by reducing the number of allocatable cmdids by 1.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:59 -04:00
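
Illustration (not part of the commit): why a submission queue may not be completely filled, as e6d15f79f9 notes. In a ring indexed by head and tail, head == tail has to mean "empty", so one slot always stays unused. The struct and helper names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_sq {
	uint16_t head;	/* next entry the device will consume */
	uint16_t tail;	/* next free slot for the host */
	uint16_t depth;	/* number of slots in the ring */
};

/* head == tail is reserved for the empty ring. */
static bool demo_sq_is_empty(const struct demo_sq *sq)
{
	return sq->head == sq->tail;
}

/* Full means one more entry would make tail catch up with head. */
static bool demo_sq_is_full(const struct demo_sq *sq)
{
	return (uint16_t)((sq->tail + 1) % sq->depth) == sq->head;
}

/* Command IDs that can be handed out without ever filling the ring. */
static uint16_t demo_max_cmdids(const struct demo_sq *sq)
{
	return (uint16_t)(sq->depth - 1);
}
```

Capping the number of allocatable command IDs at depth - 1 enforces the rule at allocation time, so the submission path never needs a separate fullness check.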
Matthew Wilcox d8ee9d69f2 NVMe: Fix discontiguous accesses
When we submit subsequent portions of the I/O, we need to access the
updated block, not start reading again from the original position.
This was showing up as miscompares in the XFS randholes testcase.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:59 -04:00
Matthew Wilcox 1ad2f8932a NVMe: Handle bios that contain non-virtually contiguous addresses
NVMe scatterlists must be virtually contiguous, like almost all I/Os.
However, when the filesystem lays out files with a hole, it can be that
adjacent LBAs map to non-adjacent virtual addresses.  Handle this by
submitting one NVMe command at a time for each virtually discontiguous
range.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:59 -04:00
Matthew Wilcox 00df5cb4eb NVMe: Implement Flush
Linux implements Flush as a bit in the bio.  That means there may also be
data associated with the flush; if so the flush should be sent before the
data.  To avoid completing the bio twice, I add CMD_CTX_FLUSH to indicate
the completion routine should do nothing.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:59 -04:00
Matthew Wilcox c42705592b NVMe: Mark CMD_CTX_CANCELLED as being unlikely
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:59 -04:00
Matthew Wilcox 7547881d09 NVMe: Correct SQ doorbell semantics
The value written to the doorbell needs to be the first free index in
the queue, not the most recently used index in the queue.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:58 -04:00
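
Illustration (not part of the commit): the doorbell ordering corrected by 7547881d09. The command is written at the current tail, the tail is advanced (wrapping at the end of the ring), and the advanced value is what goes to the doorbell. The demo_sq layout is hypothetical, the doorbell field stands in for the MMIO register, and the sketch deliberately omits any queue-full check.

```c
#include <stdint.h>

#define DEMO_DEPTH 4

struct demo_sq {
	uint32_t cmds[DEMO_DEPTH];	/* stand-in for the command slots */
	uint16_t tail;			/* first free index */
	uint32_t doorbell;		/* stand-in for the tail doorbell */
};

static void demo_submit(struct demo_sq *sq, uint32_t cmd)
{
	sq->cmds[sq->tail] = cmd;	/* fill the first free slot */
	if (++sq->tail == DEMO_DEPTH)	/* advance, wrapping at the end */
		sq->tail = 0;
	sq->doorbell = sq->tail;	/* new first free index, not the slot just used */
}
```

Writing the index of the slot just used would tell the device the queue ends one entry early, leaving the newest command unprocessed until the next submission.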
Matthew Wilcox 740216fc59 NVMe: Let the kthread take care of devices earlier
If interrupts are misconfigured, the kthread will be needed to process
admin queue completions.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:58 -04:00
Matthew Wilcox b348b7d543 NVMe: Rename nr_queues to nr_io_queues
I got confused about whether this included the admin queue or not, and
had to resort to reading the spec.  It doesn't include the admin queue,
so make that clear in the name.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:58 -04:00
Matthew Wilcox ca1615424c NVMe: Remove setting of 'flags' in rw command
This was the data transfer bit until spec rev 0.92

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:58 -04:00
Matthew Wilcox ad8a5df97c NVMe: Release 0.3
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:58 -04:00
Matthew Wilcox 1fa6aeadf1 NVMe: Add a kthread to handle the congestion list
Instead of trying to resubmit I/Os in the I/O completion path (in
interrupt context), wake up a kthread which will resubmit I/O from
user context.  This allows mke2fs to run to completion.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:58 -04:00
Matthew Wilcox eeee322647 NVMe: Handle failures differently in nvme_submit_bio_queue()
Return -EBUSY if the queue is full or -ENOMEM if we failed to allocate
memory (or map a scatterlist).  Also use GFP_ATOMIC to allocate the
nvme_bio and move the locking to the callers of nvme_submit_bio_queue().

In nvme_make_request(), don't permit an I/O to jump the queue -- if the
congestion list already has an entry, just add to the tail, rather than
trying to submit.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:58 -04:00
Matthew Wilcox 768308400f NVMe: Handle physical merging of bvec entries
In order to not overrun the sg array, we have to merge physically
contiguous pages into a single sg entry.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:57 -04:00
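
Illustration (not part of the commit): the merging described in 768308400f, reduced to plain address arithmetic. Adjacent segments whose physical ranges touch are folded into one entry, so the output never has more entries than the input and usually has fewer. demo_seg and demo_merge_segments() are hypothetical; the driver works on bio_vec and scatterlist structures instead.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_seg {
	uint64_t phys;	/* physical start address */
	uint32_t len;	/* length in bytes */
};

/* Coalesce physically contiguous neighbours in place; returns the new count. */
static size_t demo_merge_segments(struct demo_seg *segs, size_t n)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		if (out > 0 &&
		    segs[out - 1].phys + segs[out - 1].len == segs[i].phys)
			segs[out - 1].len += segs[i].len;	/* extend previous entry */
		else
			segs[out++] = segs[i];			/* start a new entry */
	}
	return out;
}
```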
Matthew Wilcox 1974b1ae88 NVMe: Check for DMA mapping failure
If dma_map_sg returns 0 (failure), we need to fail the I/O.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:57 -04:00
Matthew Wilcox d567760c40 NVMe: Pass the nvme_dev to nvme_free_prps and nvme_setup_prps
We were passing the nvme_queue to access the q_dmadev for the
dma_alloc_coherent calls, but since we moved to the dma pool API,
we really only need the nvme_dev.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:57 -04:00
Matthew Wilcox 99802a7aee NVMe: Optimise memory usage for I/Os between 4k and 128k
Add a second memory pool for smaller I/Os.  We can pack 16 of these on a
single page instead of using an entire page for each one.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:57 -04:00
Matthew Wilcox 091b609258 NVMe: Switch to use DMA Pool API
Calling dma_free_coherent from interrupt context causes warnings.
Using the DMA pools delays freeing until pool destruction, so avoids
the problem.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:57 -04:00
Matthew Wilcox d534df3c73 NVMe: Rename nvme_req_info to nvme_bio
There are too many things called 'info' in this driver.  This data
structure is auxiliary information for a struct bio, so call it nvme_bio,
or nbio when used as a variable.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:56 -04:00
Shane Michael Matthews e025344c56 NVMe: Initial PRP List support
Add a pointer to the nvme_req_info to hold a new data structure
(nvme_prps) which contains a list of the pages allocated to this
particular request for holding PRP list entries.  nvme_setup_prps()
now returns this pointer.

To allocate and free the memory used for PRP lists, we need a struct
device, so we need to pass the nvme_queue pointer to many functions
which didn't use to need it.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:56 -04:00
Matthew Wilcox 51882d00f0 NVMe: Advance the sg pointer when filling in an sg list
For multipage BIOs, we were always using sg[0] instead of advancing
through the list.  Oops :-)

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:56 -04:00
Matthew Wilcox d2d8703481 NVMe: Renumber the special context values
If POISON_POINTER_DELTA isn't defined, ensure they're in page 0 which
should never be mapped.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:56 -04:00
Matthew Wilcox 9294bbed78 NVMe: Handle the congestion list a little better
In the bio completion handler, check for bios on the congestion list
for this NVM queue.  Also, lock the congestion list in the make_request
function as the queue may end up being shared between multiple CPUs.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:56 -04:00
Matthew Wilcox e85248e516 NVMe: Record the timeout for each command
In addition to recording the completion data for each command, record
the anticipated completion time.  Choose a timeout of 5 seconds for
normal I/Os and 60 seconds for admin I/Os.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:56 -04:00
Matthew Wilcox ec6ce618d6 NVMe: Need to lock queue during interrupt handling
If we're sharing a queue between multiple CPUs and we cancel a sync I/O,
we must have the queue locked to avoid corrupting the stack of the thread
that submitted the I/O.  It turns out this is the same locking that's needed
for the threaded irq handler, so share that code.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:56 -04:00
Matthew Wilcox 48e3d39816 NVMe: Detect command IDs completing that are out of range
If the adapter completes a command ID that is outside the bounds of
the array, return CMD_CTX_INVALID instead of random data, and print a
message in the sync_completion handler (which is rapidly becoming the
misc completion handler :-)

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:55 -04:00
Matthew Wilcox b36235df01 NVMe: Detect commands that are completed twice
Set the context value to CMD_CTX_COMPLETED, and print a message in the
sync_completion handler if we see it.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:55 -04:00
Matthew Wilcox be7b62754e NVMe: Use a symbolic name to represent cancelled commands instead of 0
I have plans for other special values in sync_completion.  Plus, this
is more self-documenting, and lets us detect bogus usages.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:55 -04:00
Matthew Wilcox 58ffacb545 NVMe: Add a module parameter to use a threaded interrupt
We're currently calling bio_endio from hard interrupt context.  This is
not a good idea for preemptible kernels as it will cause longer latencies.
Using a threaded interrupt will run the entire queue processing mechanism
(including bio_endio) in a thread, which can be preempted.  Unfortunately,
it also adds about 7us of latency to the single-I/O case, so make it a
module parameter for the moment.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:55 -04:00
Matthew Wilcox b1ad37efca NVMe: Call put_nvmeq() before calling nvme_submit_sync_cmd()
We can't have preemption disabled when we call schedule().  Accept the
possibility that we'll get preempted, and it'll cost us some cacheline
bounces.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:55 -04:00
Matthew Wilcox 3c0cf138d7 NVMe: Allow fatal signals to interrupt I/O
If the user sends a fatal signal, sleeping in the TASK_KILLABLE state
permits the task to be aborted.  The only wrinkle is making sure that
if/when the command completes later that it doesn't upset anything.
Handle this by setting the data pointer to 0, and checking the value
isn't NULL in the sync completion path.  Eventually, bios can be cancelled
through this path too.  Note that the cmdid isn't freed to prevent reuse.

We should also abort the command in the future, but this is a good start.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:55 -04:00
Matthew Wilcox db5d0c198d NVMe: Release 0.2
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:54 -04:00
Matthew Wilcox 6ee44cdced NVMe: Add download / activate firmware ioctls
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:54 -04:00
Matthew Wilcox 388f037f4e NVMe: Move sysfs entries to the right place
Because I wasn't setting driverfs_dev, the devices were showing up under
/sys/devices/virtual/block.  Now they appear underneath the PCI device
which they belong to.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:54 -04:00
Shane Michael Matthews 5911f20039 NVMe: Disable the device before we write the admin queues
In case the card has been left in a partially-configured state,
write 0 to the Enable bit.

Signed-off-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:54 -04:00
Matthew Wilcox 574e8b95bc NVMe: Request I/O regions
Calling pci_request_selected_regions() reserves these regions for our use.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:54 -04:00
Matthew Wilcox 2930353f9f NVMe: Allow queues to be allocated above 4GB
Need to call dma_set_coherent_mask() to allow queues to be allocated
above 4GB.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:53 -04:00
Matthew Wilcox f64d3365a3 NVMe: Enable device DMA
Need to call pci_set_master() to enable device DMA

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:53 -04:00
Shane Michael Matthews 0ee5a7d7cb NVMe: Enable and disable the PCI device
Call pci_enable_device_mem() at initialisation and pci_disable_device
at exit.

Signed-off-by: Shane Michael Matthews <shane.matthews@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:53 -04:00
Matthew Wilcox 3f85d50b60 NVMe: Check returns from nvme_alloc_queue()
It can return NULL, so handle that.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:53 -04:00
Matthew Wilcox 8e9f0e7115 NVMe: Remove 'node' from nvme_dev
We don't keep a list of nvme_dev any more

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:53 -04:00
Matthew Wilcox 51814232ec NVMe: Read the model, serial & firmware rev from the controller
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:53 -04:00
Matthew Wilcox a53295b699 NVMe: Add NVME_IOCTL_SUBMIT_IO
Allow userspace to submit synchronous I/O like the SCSI sg interface does.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:53 -04:00
Matthew Wilcox 7fc3cdabba NVMe: Create nvme_map_user_pages() and nvme_unmap_user_pages()
These are generalisations of the code that was in
nvme_submit_user_admin_command().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:52 -04:00
Matthew Wilcox bd38c5557c NVMe: Change NVME_IOCTL_GET_RANGE_TYPE to return all the ranges
Factor out most of nvme_identify() into a new nvme_submit_user_admin_command()
function.  Change nvme_get_range_type() to call it and change nvme_ioctl to
realise that it's getting back all 64 ranges.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:52 -04:00
Matthew Wilcox b8deb62cf2 NVMe: Zero the command before we send it
Make sure there's no left-over bits set from previous commands that used
this slot.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:52 -04:00
Matthew Wilcox ff22b54fda NVMe: Add nvme_setup_prps()
Generalise the code from nvme_identify() that sets PRP1 & PRP2 so that
it's usable for commands sent by nvme_submit_bio_queue().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:52 -04:00
Matthew Wilcox 36c14ed9ca NVMe: Use PRP2 for the nvme_identify ioctl
DMA the result straight to userspace instead of bounce-buffering in the
kernel.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:52 -04:00
Matthew Wilcox 53c9577e9c NVMe: Fix admin IRQ claim on real hardware
The admin IRQ is supposed to use the pin-based (or single message MSI)
interrupt.  Accomplish this by filling in entry[0]'s vector with the
INTx irq number.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:51 -04:00
Matthew Wilcox 821234603b NVMe: Rename 'cycle' to 'phase'
It's called the phase bit in the current draft

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:51 -04:00
Matthew Wilcox 1b23484bd0 NVMe: Implement per-CPU queues
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:51 -04:00
Matthew Wilcox b3b06812e1 NVMe: Reduce set_queue_count arguments by one
sq_count and cq_count are always the same, so just call it 'count'.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:51 -04:00
Matthew Wilcox 3001082cac NVMe: Factor out queue_request_irq()
Two callers with an almost identical long string of arguments, and
introducing a third soon.  Time to factor out the commonalities.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:51 -04:00
Matthew Wilcox b60503ba43 NVMe: New driver
This driver is for devices that follow the NVM Express standard

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2011-11-04 15:52:51 -04:00
Michael S. Tsirkin 5087a50e66 virtio-blk: use ida to allocate disk index
Based on a patch by Mark Wu <dwu@redhat.com>

Current index allocation in virtio-blk is based on a monotonically
increasing variable "index". This means we'll run out of numbers
after a while.  It also could cause confusion about the disk
name in the case of hot-plugging disks.
Change virtio-blk to use ida to allocate index, instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-11-02 11:41:02 +10:30
Paul Gortmaker 0c8d44f239 block: Fix files that are modules and hence need module.h
We want to remove the implicit everywhere presence of module.h
so fix up the people relying on that implicit presence in advance.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:13 -04:00
Paul Gortmaker d5decd3b95 block: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros
These files were getting <linux/module.h> via an implicit include
path, but we want to crush those out of existence since they cost
time during compiles of processing thousands of lines of headers
for no reason.  Give them the lightweight header that just contains
the EXPORT_SYMBOL infrastructure.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:12 -04:00
Michael S. Tsirkin a0eda62552 virtio-blk: use ida to allocate disk index
Based on a patch by Mark Wu <dwu@redhat.com>

Current index allocation in virtio-blk is based on a monotonically
increasing variable "index". This means we'll run out of numbers
after a while.  It also could cause confusion about the disk
name in the case of hot-plugging disks.
Change virtio-blk to use ida to allocate index, instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-31 08:05:36 +01:00
Linus Torvalds 97d2eb13a0 Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-client
* 'for-linus' of git://ceph.newdream.net/git/ceph-client:
  libceph: fix double-free of page vector
  ceph: fix 32-bit ino numbers
  libceph: force resend of osd requests if we skip an osdmap
  ceph: use kernel DNS resolver
  ceph: fix ceph_monc_init memory leak
  ceph: let the set_layout ioctl set single traits
  Revert "ceph: don't truncate dirty pages in invalidate work thread"
  ceph: replace leading spaces with tabs
  libceph: warn on msg allocation failures
  libceph: don't complain on msgpool alloc failures
  libceph: always preallocate mon connection
  libceph: create messenger with client
  ceph: document ioctls
  ceph: implement (optional) max read size
  ceph: rename rsize -> rasize
  ceph: make readpages fully async
2011-10-28 16:42:18 -07:00
David Vrabel 2d073846b8 block: xen-blkback: use API provided by xenbus module to map rings
The xenbus module provides xenbus_map_ring_valloc() and
xenbus_map_ring_vfree().  Use these to map the ring pages granted by
the frontend.

Acked-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-26 10:02:35 -04:00
Sage Weil 6ab00d465a libceph: create messenger with client
This simplifies the init/shutdown paths, and makes client->msgr available
during the rest of the setup process.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25 16:10:15 -07:00
Linus Torvalds 59e5253417 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (59 commits)
  MAINTAINERS: linux-m32r is moderated for non-subscribers
  linux@lists.openrisc.net is moderated for non-subscribers
  Drop default from "DM365 codec select" choice
  parisc: Kconfig: cleanup Kernel page size default
  Kconfig: remove redundant CONFIG_ prefix on two symbols
  cris: remove arch/cris/arch-v32/lib/nand_init.S
  microblaze: add missing CONFIG_ prefixes
  h8300: drop puzzling Kconfig dependencies
  MAINTAINERS: microblaze-uclinux@itee.uq.edu.au is moderated for non-subscribers
  tty: drop superfluous dependency in Kconfig
  ARM: mxc: fix Kconfig typo 'i.MX51'
  Fix file references in Kconfig files
  aic7xxx: fix Kconfig references to READMEs
  Fix file references in drivers/ide/
  thinkpad_acpi: Fix printk typo 'bluestooth'
  bcmring: drop commented out line in Kconfig
  btmrvl_sdio: fix typo 'btmrvl_sdio_sd6888'
  doc: raw1394: Trivial typo fix
  CIFS: Don't free volume_info->UNC until we are entirely done with it.
  treewide: Correct spelling of successfully in comments
  ...
2011-10-25 12:11:02 +02:00
Linus Torvalds 31018acd4c Merge branches 'stable/bug.fixes-3.2' and 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/p2m/debugfs: Make type_name more obvious.
  xen/p2m/debugfs: Fix potential pointer exception.
  xen/enlighten: Fix compile warnings and set cx to known value.
  xen/xenbus: Remove the unnecessary check.
  xen/irq: If we fail during msi_capability_init return proper error code.
  xen/events: Don't check the info for NULL as it is already done.
  xen/events: BUG() when we can't allocate our event->irq array.

* 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: Fix selfballooning and ensure it doesn't go too far
  xen/gntdev: Fix sleep-inside-spinlock
  xen: modify kernel mappings corresponding to granted pages
  xen: add an "highmem" parameter to alloc_xenballooned_pages
  xen/p2m: Use SetPagePrivate and its friends for M2P overrides.
  xen/p2m: Make debug/xen/mmu/p2m visible again.
  Revert "xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set."
2011-10-25 09:17:47 +02:00
Jens Axboe 83157223de Merge branch 'for-linus' into for-3.2/core
2011-10-24 16:24:38 +02:00
Mike Miller ab5dbebe33 cciss: add small delay when using PCI Power Management to reset for kdump
The P600 requires a small delay when changing states. Otherwise we may think
the board did not reset and we bail. This is for kdump only and is particular
to the P600.

Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-20 22:21:52 +02:00
Jens Axboe b8d8bdfe31 Merge branch 'stable/for-jens-3.2' of git://oss.oracle.com/git/kwilk/xen into for-3.2/drivers
2011-10-20 15:10:59 +02:00
Jens Axboe 5c04b426f2 Merge branch 'v3.1-rc10' into for-3.2/core
Conflicts:
	block/blk-core.c
	include/linux/blkdev.h

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-19 14:30:42 +02:00
Konrad Rzeszutek Wilk 6927d92091 xen/blkback: Fix two races in the handling of barrier requests.
There are two windows of opportunity to cause a race when
processing a barrier request. This patch fixes this.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-17 14:28:57 -04:00
Christoph Hellwig 456be1484f loop: remove the incorrect write_begin/write_end shortcut
Currently the loop device tries to call directly into write_begin/write_end
instead of going through ->write if it can.  This is a fairly nasty shortcut
as write_begin and write_end are only callbacks for the generic write code
and expect to be called with filesystem specific locks held.

This code currently causes various issues for clustered filesystems as it
doesn't take the required cluster locks, and it also causes issues for XFS
as it doesn't properly lock against the swapext ioctl as called by the
defragmentation tools.  This in turn causes data corruption if
defragmentation hits a busy loop device in the wrong time window, as
reported by RH QA.

The reason why we have this shortcut is that it saves a data copy when
doing a transformation on the loop device, which is the technical term
for using cryptoloop (or an XOR transformation).  Given that cryptoloop
has been deprecated in favour of dm-crypt my opinion is that we should
simply drop this shortcut instead of finding complicated ways to
introduce a formal interface for this shortcut.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-17 12:57:20 +02:00
Konrad Rzeszutek Wilk dda1852802 xen/blkback: Check for proper operation.
The patch titled: "xen/blkback: Fix the inhibition to map pages
when discarding sector ranges." had the right idea except that
it used the wrong comparison operator. It had == instead of !=.

This fixes the bug where all (except discard) operations would
have been ignored.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-14 12:29:55 -04:00
Lars Ellenberg 3cb7a2a90f drbd: get rid of drbd_bcast_ee, it is of no use anymore
This function was used to broadcast the (leading part of the)
bio payload in case we see a data integrity error.  It could be received
from userland with the drbdsetup events subcommand,
to have a peek into the payload that caused the checksum mismatch,
and guess from there what may have caused the mismatch,
mainly to guess whether it was modification of in-flight data,
or data corruption by broken hardware or software bugs.

Meanwhile we support bios that are larger than the maximum payload a
netlink datagram can carry.
And we have means to reliably detect modification of in-flight data by
calculating, and comparing, the checksum before and after sendmsg.
There is no need to carry this around anymore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:08 +02:00
Lars Ellenberg 569083c08d drbd: fix drbd_delete_device: remove vnr from volumes; idr_remove(); synchronize_rcu(); before cleanup
Still missing: rcu_read_lock() on the various call sites that
access/iterate over those idrs.

We don't need a specific write lock, as we only modify from
configuration context, which is already strictly serialized.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:07 +02:00
Lars Ellenberg da4a75d2ef drbd: introduce a bio_set to allocate housekeeping bios from
Don't rely on availability of bios from the global fs_bio_set,
we should use our own bio_set for meta data IO.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:06 +02:00
Lars Ellenberg 9db4e77f8c drbd: use the newly introduced page pool for bitmap IO
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:05 +02:00
Lars Ellenberg 35abf59424 drbd: add page pool to be used for meta data IO
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:04 +02:00
Lars Ellenberg 3c13b680ce drbd: only wakeup if something changed in update_peer_seq
This commit got it wrong:
    drbd: Make the peer_seq updating code more obvious

    Make it more clear that update_peer_seq() is supposed to wake up the
    seq_wait queue whenever the sequence number changes.

We don't need to wake up every time we receive a sequence number
that is _different_ from our currently stored "newest" sequence number,
but only if we receive a sequence number _newer_ than what we already
have, when we actually change mdev->peer_seq.
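A minimal sketch of the "wake only if newer" rule described above, under stated assumptions: `struct peer_state`, `seq_newer()` and `update_peer_seq_sketch()` are hypothetical names for illustration, not the driver's code, and the wrap-around comparison is one common way to order 32-bit sequence numbers. The return value stands in for the decision to wake the wait queue.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device state; only the field used here. */
struct peer_state {
    uint32_t peer_seq;   /* newest sequence number stored so far */
};

/* True if a is newer than b. The signed difference tolerates 32-bit
 * wrap-around (assumes the usual two's complement conversion). */
static bool seq_newer(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/* Store new_seq only if it is newer than the current value.
 * Returns true when the stored value changed, which is the only case
 * in which waiters would need to be woken. */
static bool update_peer_seq_sketch(struct peer_state *st, uint32_t new_seq)
{
    if (!seq_newer(new_seq, st->peer_seq))
        return false;    /* older or equal: nothing changed, no wakeup */
    st->peer_seq = new_seq;
    return true;
}
```

A sequence number that is merely different (for example an older one arriving late) leaves the stored value untouched and returns false.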

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:04 +02:00
Lars Ellenberg 2c4a48d097 drbd: remove unused define
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:02 +02:00
Philipp Reisner 81a5d60ecf drbd: Replaced the minor_table array by an idr
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:01 +02:00
Philipp Reisner 774b305518 drbd: Implemented new commands to create/delete connections/minors
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:00 +02:00
Philipp Reisner 80883197da drbd: Converted drbd_nl_(net_conf|disconnect)() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:48:00 +02:00
Philipp Reisner 1aba4d7fcf drbd: Preparing the connector interface to operate on connections
Up to now it only operated on minor numbers. Now it can also work
on named connections.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:59 +02:00
Philipp Reisner 2f5cdd0b2c drbd: Converted the transfer log from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:58 +02:00
Philipp Reisner 49559d87fd drbd: Improved the dec_*() macros
Now those can be used with a struct drbd_conf * that has a name
other than 'mdev'.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:57 +02:00
Philipp Reisner 3f9cbe937e drbd: Removed the mdev parameter from the ..to_tags() and ...from_tags() functions
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:56 +02:00
Philipp Reisner 0e29d163f7 drbd: Reworked the unconfiguring and thread stopping code
* Moved CONFIG_PENDING and DEVICE_DYING from mdev to tconn.
* Renamed drbd_reconfig_start() and drbd_reconfig_done() to
  conn_reconfig_start() and conn_reconfig_done().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:55 +02:00
Andreas Gruenbacher c66342d949 drbd: Remove left-over function prototypes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:55 +02:00
Andreas Gruenbacher 7201b972de drbd: Replace get_asender_cmd() with its implementation
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:54 +02:00
Andreas Gruenbacher 6e849ce88c drbd: Get rid of P_MAX_CMD
Instead of artificially enlarging the command decoding arrays to
P_MAX_CMD entries, check if an index is within the valid range using the
ARRAY_SIZE() macro.
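As a rough illustration of the bounds check described above, the sketch below indexes a sparse handler table and rejects out-of-range commands with `ARRAY_SIZE()`. The table contents, `handle_ping()`, `handle_data()` and `lookup_handler()` are hypothetical; only the `ARRAY_SIZE()` idiom itself is taken from the commit message.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*cmd_handler)(void);

static int handle_ping(void) { return 1; }
static int handle_data(void) { return 2; }

/* Hypothetical decoding table indexed by command number.
 * Unlisted slots are implicitly NULL; the table is only as large as
 * its highest initialized index, with no padding up to a MAX constant. */
static const cmd_handler handlers[] = {
    [0] = handle_ping,
    [3] = handle_data,
};

/* Return the handler for cmd, or NULL if cmd is out of range or unused. */
static cmd_handler lookup_handler(unsigned int cmd)
{
    if (cmd >= ARRAY_SIZE(handlers))
        return NULL;
    return handlers[cmd];
}
```

Because the limit is derived from the array itself, adding an entry cannot leave the range check out of date.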

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:53 +02:00
Andreas Gruenbacher 1b3bb47d52 drbd: Remove redundant check
Opening a device only succeeds on a primary node, or when explicitly
setting the allow_oos module parameter to allow opening the device
read-only on a secondary node.  There is no other way that a request can
get into drbd_make_request(), so this code cannot trigger.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:52 +02:00
Andreas Gruenbacher 7be8da0798 drbd: Improve how conflicting writes are handled
The previous algorithm for dealing with overlapping concurrent writes
was generating unnecessary warnings for scenarios which could be
legitimate, and did not always handle partially overlapping requests
correctly.  Improve the algorithm as follows:

* While local or remote write requests are in progress, conflicting new
  local write requests will be delayed (commit 82172f7).

* When a conflict between a local and remote write request is detected,
  the node with the discard flag decides how to resolve the conflict: It
  will ask its peer to discard conflicting requests which are fully
  contained in the local request and retry requests which overlap only
  partially.  This involves a protocol change.
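The distinction between "fully contained" and "overlapping only partially" can be sketched as an interval classification. This is an illustration only: `struct interval`, `enum conflict` and `classify()` are hypothetical names, and half-open sector ranges are an assumption rather than the driver's representation.

```c
#include <stdint.h>

enum conflict { NO_OVERLAP, FULLY_CONTAINED, PARTIAL_OVERLAP };

/* Half-open sector interval [start, start + size). */
struct interval {
    uint64_t start;
    uint64_t size;
};

/* Classify a peer request relative to a local request. */
static enum conflict classify(struct interval local, struct interval peer)
{
    uint64_t l_end = local.start + local.size;
    uint64_t p_end = peer.start + peer.size;

    if (p_end <= local.start || l_end <= peer.start)
        return NO_OVERLAP;
    if (peer.start >= local.start && p_end <= l_end)
        return FULLY_CONTAINED;   /* the case where a discard is requested */
    return PARTIAL_OVERLAP;       /* the case where a retry is requested */
}
```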

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:51 +02:00
Andreas Gruenbacher 71b1c1eb9c drbd: Use ping-timeout when waiting for missing ack packets
When the node with the discard flag resolves write conflicts in
dual-primary mode, it may determine that its peer has sent ack packets
on the metadata socket which have not arrived yet.  Wait for the next ack
with ping-timeout instead of a hard-coded 30 seconds.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:51 +02:00
Andreas Gruenbacher 8ccf218e9f drbd: Replace atomic_add_return with atomic_inc_return
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:50 +02:00
Andreas Gruenbacher 206d358941 drbd: Concurrent write detection fix
Commit 9b1e63e changed the concurrent write detection algorithm to only insert
peer requests into write_requests tree after determining that there is no
conflict.  With this change, new conflicting local requests could be added
while the algorithm runs, but this case was not handled correctly.  Instead of
making the algorithm deal with this case, switch back to adding peer requests
to the write_requests tree immediately: this improves fairness.

When a peer request is discarded, remove that request from the write_requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:49 +02:00
Andreas Gruenbacher 8050e6d005 drbd: Use container_of() instead of casting
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:48 +02:00
Lars Ellenberg 9676c76097 drbd: fix a wrong likely(), updated comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:47 +02:00
Lars Ellenberg c9d963a46d drbd: silence some log messages on bitmap IO
Summary log messages meant for global bitmap IO
should not be printed for bitmap IO caused by
activity log transactions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:47 +02:00
Lars Ellenberg 7ad651b522 drbd: new on-disk activity log transaction format
Use a new on-disk transaction format for the activity log, which allows
for multiple changes to the active set per transaction.

Using 4k transaction blocks, we can now get rid of the work-around code
to deal with devices not supporting 512 byte logical block size.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:46 +02:00
Lars Ellenberg 46a15bc3ec lru_cache: allow multiple changes per transaction
Allow multiple changes to the active set of elements in lru_cache.
The only current user of lru_cache, drbd, is driving this generalisation.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:45 +02:00
Lars Ellenberg 45dfffebd0 drbd: allow to select specific bitmap pages for writeout
We are about to allow several changes to the active set in one activity
log transaction. We have to write out the corresponding bitmap pages as
well, if changed.

Introduce drbd_bm_mark_for_writeout(), then re-use the existing bitmap
writeout path to submit all marked pages in one go.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:44 +02:00
Lars Ellenberg 4738fa1690 drbd: use clear_bit_unlock() where appropriate
Some open-coded clear_bit(); smp_mb__after_clear_bit();
should in fact have been smp_mb__before_clear_bit(); clear_bit();

Instead, use clear_bit_unlock() to annotate the intention,
and have it do the right thing.
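A userspace analogue of the ordering issue, assuming C11 atomics in place of the kernel bit operations: releasing a bit used as a lock needs the barrier before the clear, which a release-ordered atomic provides in one step. `BUSY_BIT`, `try_lock_bit()` and `unlock_bit()` are hypothetical names for this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define BUSY_BIT 0x1u

/* Try to take the bit lock. Acquire ordering keeps the critical
 * section from being reordered before the bit is set. */
static bool try_lock_bit(atomic_uint *flags)
{
    unsigned int old = atomic_fetch_or_explicit(flags, BUSY_BIT,
                                                memory_order_acquire);
    return (old & BUSY_BIT) == 0;
}

/* Drop the bit lock. Release ordering makes all earlier stores visible
 * before the bit is seen as clear; a barrier placed after the clear
 * would not give that guarantee. */
static void unlock_bit(atomic_uint *flags)
{
    atomic_fetch_and_explicit(flags, ~BUSY_BIT, memory_order_release);
}
```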

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:42 +02:00
Lars Ellenberg 61610420f7 drbd: in drbd_suspend_al, set AL_SUSPENDED before unlocking the activity log
As using an empty activity log is the whole point of the exercise,
make sure it is still empty when setting AL_SUSPENDED.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:41 +02:00
Lars Ellenberg 867f57483b drbd: fix typo in comment
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:40 +02:00
Lars Ellenberg 8c387def58 drbd: simplify condition in drbd_may_do_local_read()
fold
	if (x >= (N+1))
		return 0;
	if (x < N)
		return 0;
into
	if (x != N)
		return 0;
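
The folding works because, for integers, "not >= N+1 and not < N" leaves exactly x == N. A small sketch that puts the two forms side by side (function names are hypothetical, and `n + 1` is assumed not to overflow):

```c
#include <stdbool.h>

/* Original form: two separate range checks. */
static bool allowed_two_checks(int x, int n)
{
    if (x >= (n + 1))
        return false;
    if (x < n)
        return false;
    return true;
}

/* Folded form: a single inequality test. */
static bool allowed_folded(int x, int n)
{
    if (x != n)
        return false;
    return true;
}
```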

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:39 +02:00
Andreas Gruenbacher c670a39867 drbd: Use the IS_ALIGNED() macro in some more places
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:39 +02:00
Andreas Gruenbacher 8ca9844f10 drbd: Remove obsolete comment
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:38 +02:00
Andreas Gruenbacher d0e22a260c drbd: Iterate over all overlapping intervals in a tree
Add a macro and helper function for doing that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:37 +02:00
Andreas Gruenbacher fcefa62e4c drbd: Rename drbd_endio_{pri,sec} -> drbd_{,peer_}request_endio
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:36 +02:00
Andreas Gruenbacher fbe29dec98 drbd: Rename drbd_submit_ee -> drbd_submit_peer_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:35 +02:00
Philipp Reisner df24aa45f4 drbd: Implemented connection wide state changes
That is used for graceful disconnect only

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:32 +02:00
Philipp Reisner 047cd4a682 drbd: implemented receiving of P_CONN_ST_CHG_REQ
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:05 +02:00
Philipp Reisner fc3b10a45f drbd: Implemented receiving of P_CONN_ST_CHG_REPLY
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:04 +02:00
Philipp Reisner 5aabf467e3 drbd: Global_state_lock not necessary here...
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:03 +02:00
Philipp Reisner cf29c9d8c8 drbd: Implemented conn_send_state_req()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:02 +02:00
Philipp Reisner 8410da8f0e drbd: Introduced tconn->cstate_mutex
In compatibility mode with old DRBDs, use that as the state_mutex
as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:01 +02:00
Philipp Reisner dad2055481 drbd: Removed drbd_state_lock() and drbd_state_unlock()
The lock they constructed is only taken when the state_mutex
was already taken. It is superfluous.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:45:01 +02:00
Philipp Reisner bbeb641c3e drbd: Killed volume0; last step of multi-volume-enablement
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:44:58 +02:00
Konrad Rzeszutek Wilk 64391b2536 xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
The 'operation' parameters are the ones provided to the bio layer while
the req->operation are the ones passed in between the backend and
frontend. We used the wrong 'operation' value to squash the
call to map pages when processing the discard operation resulting
in a hypercall that did nothing. Let's guard against going into the
mapping function by checking for the proper operation type.

CC: Li Dongyang <lidongyang@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:38 -04:00
Konrad Rzeszutek Wilk 5c62cb4860 xen/blkback: Report VBD_WSECT (wr_sect) properly.
We did not increment the number of sectors written to disk
because we tested for == WRITE, which is incorrect: the
operations are more like WRITE_FLUSH or WRITE_ODIRECT. This patch
fixes it by doing a & WRITE check.
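To show why an equality test misses composite write operations, here is a sketch with made-up flag values. The `SK_*` constants are hypothetical and are not the kernel's definitions; the only assumption carried over from the message is that the composite operations include the write bit.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only. */
#define SK_WRITE        0x1u
#define SK_SYNC         0x2u
#define SK_FLUSH        0x4u
#define SK_WRITE_FLUSH  (SK_WRITE | SK_SYNC | SK_FLUSH)

/* Equality only matches a plain write and misses composite operations. */
static bool is_write_exact(unsigned int op)
{
    return op == SK_WRITE;
}

/* Masking matches any operation that has the write bit set. */
static bool is_write_masked(unsigned int op)
{
    return (op & SK_WRITE) != 0;
}
```

With the exact comparison, a write-plus-flush request would not be counted as a write, so its sectors would be left out of the statistics.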

CC: stable@kernel.org
Reported-by: Andy Burns <xen.lists@burns.me.uk>
Suggested-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:37 -04:00
Konrad Rzeszutek Wilk 29bde09378 xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
We emulate the barrier requests by draining the outstanding bio's
and then sending the WRITE_FLUSH command. To drain the I/Os
we use the refcnt that is used during disconnect to wait for all
the I/Os before disconnecting from the frontend. We latch on its
value and if it reaches either the threshold for disconnect or when
there are no more outstanding I/Os, then we have drained all I/Os.

Suggested-by: Christopher Hellwig <hch@infradead.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:36 -04:00
Laszlo Ersek 469738e675 xen-blkfront: plug device number leak in xlblk_init() error path
... though after a failed xenbus_register_frontend() all may be lost.

Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:35 -04:00
Konrad Rzeszutek Wilk d11e615830 xen-blkfront: If no barrier or flush is supported, use invalid operation.
Guard against issuing BLKIF_OP_WRITE_BARRIER or BLKIF_OP_FLUSH_CACHE
by checking whether we successfully negotiated with the backend.
The negotiation with the backend also sets the q->flush_flags which
fortunately for us is also used when submitting a bio to us. If
we don't support barriers or flushes it would be set to zero so
we should never end up having to deal with REQ_FLUSH | REQ_FUA.

However, other third party implementations of __make_request that
might be stacked on top of us might not be so smart, so let's fix this up.

Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:35 -04:00
Jan Beulich 8e6dc6fe51 xen-blkback: use kzalloc() in favor of kmalloc()+memset()
This fixes the problem of three of those four memset()-s having
improper size arguments passed: Sizeof a pointer-typed expression
returns the size of the pointer, not that of the pointed to data.

It also reverts to using kmalloc() instead of kzalloc() for the allocation
of the pending grant handles array, as that array gets fully
initialized in a subsequent loop.
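
The size mistake described above is easy to reproduce in userspace. The sketch below uses `calloc()` as a stand-in for kzalloc(); `struct pending_req` and the helper names are hypothetical and only serve to contrast `sizeof` of a pointer with `sizeof` of the pointed-to data.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical request record, just large enough to differ from a pointer. */
struct pending_req {
    int  id;
    char payload[60];
};

/* Zeroed allocation in one call, so no separate memset() size to get wrong. */
static struct pending_req *alloc_reqs(size_t n)
{
    return calloc(n, sizeof(struct pending_req));
}

/* What memset(reqs, 0, sizeof(reqs)) would clear: one pointer's worth. */
static size_t buggy_clear_size(struct pending_req *reqs)
{
    return sizeof(reqs);
}

/* What was meant: the whole array of n elements. */
static size_t intended_clear_size(struct pending_req *reqs, size_t n)
{
    return n * sizeof(*reqs);
}
```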

Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:34 -04:00
Joe Jin c555aab97d xen-blkback: fixed indentation and comments
This patch fixes the following:

1. Fix code style issue.
2. Fix incorrect function names in comments.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:33 -04:00
Li Dongyang 69ef68cef9 xen-blkfront: fix a deadlock while handling discard response
When we get -EOPNOTSUPP response for a discard request, we will clear
the discard flag on the request queue so we won't attempt to send discard
requests to backend again, and this should be protected under rq->queue_lock.
However, when we setup the request queue, we pass blkif_io_lock to
blk_init_queue so rq->queue_lock is blkif_io_lock indeed, and this lock
is already taken when we are in blkif_interrupt, so remove the
spin_lock/spin_unlock when we clear the discard flag or we will end up
with a deadlock here.

Signed-off-by: Li Dongyang <lidongyang@novell.com>
[v1: Updated description a bit and removed comment from source]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:32 -04:00
Li Dongyang ed30bf317c xen-blkfront: Handle discard requests.
If the backend advertises 'feature-discard', then interrogate
the backend for alignment and granularity. Setup the request
queue with the appropriate values and send the discard operation
as required.

Signed-off-by: Li Dongyang <lidongyang@novell.com>
[v1: Amended commit description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:31 -04:00
Li Dongyang b3cb0d6adc xen-blkback: Implement discard requests ('feature-discard')
..aka ATA TRIM/SCSI UNMAP command to be passed through the frontend
and used as appropriate by the backend. We also advertise
certain granularity parameters to the frontend so it can plug them in.
If the backend is a real device - we just end up using
'blkdev_issue_discard' while for loopback devices - we just punch
a hole in the image file.

Signed-off-by: Li Dongyang <lidongyang@novell.com>
[v1: Fixed up pr_debug and commit description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-13 09:48:30 -04:00
Stefano Stabellini 0930bba674 xen: modify kernel mappings corresponding to granted pages
If we want to use granted pages for AIO, changing the mappings of a user
vma and the corresponding p2m is not enough, we also need to update the
kernel mappings accordingly.
Currently this is only needed for pages that are created for user usages
through /dev/xen/gntdev. As in, pages that have been in use by the
kernel and use the P2M will not need this special mapping.
However there are no guarantees that in the future the kernel won't
start accessing pages through the 1:1 even for internal usage.

In order to avoid the complexity of dealing with highmem, we allocate
the pages in lowmem.
We issue a HYPERVISOR_grant_table_op right away in
m2p_add_override and we remove the mappings using another
HYPERVISOR_grant_table_op in m2p_remove_override.
Considering that m2p_add_override and m2p_remove_override are called
once per page we use multicalls and hypercall batching.

Use the kmap_op pointer directly as argument to do the mapping as it is
guaranteed to be present up until the unmapping is done.
Before issuing any unmapping multicalls, we need to make sure that the
mapping has already been done, because we need the kmap->handle to be
set correctly.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v1: Removed GRANT_FRAME_BIT usage]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 10:32:58 -04:00
Philipp Reisner 56707f9e87 drbd: Code de-duplication; new function apply_mask_val()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:22 +02:00
Philipp Reisner 4308a0a390 drbd: Removed the os parameter from sanitize_state()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:21 +02:00
Philipp Reisner fda74117dc drbd: Extracted is_valid_conn_transition() out of is_valid_transition()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:20 +02:00
Philipp Reisner 3509502dc8 drbd: Extracted is_valid_transition() out of sanitize_state()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:19 +02:00
Philipp Reisner a75f34ad0c drbd: Renamed is_valid_state_transition() to is_valid_soft_transition()
And removed the unused mdev parameter, and made the order of
the state parameters: os, ns

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:18 +02:00
Philipp Reisner d50eee21c4 drbd: Extracted after_conn_state_ch() out of after_state_ch()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:17 +02:00
Philipp Reisner 2a67d8b93b drbd: Converted drbd_send_ping() and related functions from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:16 +02:00
Philipp Reisner 00d56944ff drbd: Generalized the work callbacks
Work callbacks no longer have to operate on an mdev. From now on they
can also operate on a tconn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:15 +02:00
Philipp Reisner 6699b65533 drbd: Moved some initializing code into drbd_new_tconn()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:14 +02:00
Philipp Reisner 392c880192 drbd: drbd_thread has now a pointer to a tconn instead of to a mdev
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:13 +02:00
Philipp Reisner 19393e105f drbd: Converted drbd_worker() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:12 +02:00
Philipp Reisner 32862ec705 drbd: Converted drbd_asender() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:11 +02:00
Philipp Reisner 4d641dd7b0 drbd: Converted drbdd_init() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:10 +02:00
Philipp Reisner f1b3a6ec7d drbd: Consolidated the setup of the thread name into the framework
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:09 +02:00
Philipp Reisner a21e929827 drbd: Moved the mdev member into drbd_work (from drbd_request and drbd_peer_request)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:08 +02:00
Philipp Reisner 360cc74052 drbd: Converted drbd_free_sock() and drbd_disconnect() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:06 +02:00
Philipp Reisner eefc2f7de2 drbd: Converted drbdd() from mdev to tconn
The drbd_md_sync(mdev) happens in the after state change anyways...

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:05 +02:00
Philipp Reisner 808222845d drbd: Converted drbd_calc_cpu_mask() and drbd_thread_current_set_cpu() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:04 +02:00
Philipp Reisner 907599e044 drbd: Converted drbd_connect() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:03 +02:00
Philipp Reisner 062e879c8b drbd: Use an idr data structure to map volume numbers to mdev pointers
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:02 +02:00
Philipp Reisner dc8228d107 drbd: Converted drbd_send_protocol() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:01 +02:00
Philipp Reisner 13e6037dc9 drbd: Converted drbd_do_auth() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:00 +02:00
Philipp Reisner 611208706f drbd: Converted drbd_(get|put)_data_sock() and drbd_send_cmd2() to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:32:59 +02:00
Philipp Reisner 65d11ed6f2 drbd: Converted drbd_do_handshake() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:32:58 +02:00
Philipp Reisner 9ba7aa00ae drbd: Converted drbd_recv_header() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:32:57 +02:00
Philipp Reisner ce24385342 drbd: Converted decode_header() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:32:56 +02:00
Philipp Reisner 77351055b5 drbd: struct packet_info to hold information of decoded packets
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:32:50 +02:00
Philipp Reisner de0ff338d6 drbd: Converted drbd_recv() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:29:52 +02:00
Philipp Reisner 8a22cccc20 drbd: Converted drbd_send_handshake() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:29:50 +02:00
Philipp Reisner a25b63f1e7 drbd: Converted drbd_recv_fp() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:29:49 +02:00
Philipp Reisner dbd9eea094 drbd: Removed unused mdev argument from drbd_recv_short() and drbd_socket_okay()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:29:48 +02:00
Philipp Reisner d38e787ecc drbd: Converted drbd_send_fp() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:29:45 +02:00
Philipp Reisner bedbd2a53a drbd: Converted drbd_send() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:27:00 +02:00
Philipp Reisner 1a7ba646e9 drbd: Converted helper functions for drbd_send() to tconn
* drbd_update_congested()
* we_should_drop_the_connection()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:59 +02:00
Philipp Reisner 0625ac190d drbd: Converted wake_asender() and request_ping() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:58 +02:00
Philipp Reisner 808e37b803 drbd: Moved SIGNAL_ASENDER to the per connection (tconn) flags
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:57 +02:00
Philipp Reisner e43ef195f8 drbd: Moved SEND_PING to the per connection (tconn) flags
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:56 +02:00
Philipp Reisner 25703f8320 drbd: Moved DISCARD_CONCURRENT to the per connection (tconn) flags
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:55 +02:00
Philipp Reisner 01a311a589 drbd: Started to separate connection flags (tconn) from block device flags (mdev)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:54 +02:00
Philipp Reisner 7653620de3 drbd: Converted drbd_wait_for_connect() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:53 +02:00
Philipp Reisner eac3e990e4 drbd: Converted drbd_try_connect() from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:52 +02:00
Philipp Reisner 60ae496626 drbd: conn_printk() a dev_printk() alike for drbd's connections
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:50 +02:00
Philipp Reisner b53339fce2 drbd: Moving state related macros to drbd_state.h
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:49 +02:00
Philipp Reisner 8ea62f5464 drbd: Revert "Make sure we dont send state if a cluster wide state change is in progress"
This reverts commit 6e9fdc92b77915d5c7ab8fea751f48378f8b0080.

1) This did not fix the issue
2) Long sleeping work items can cause IO requests to take as long as
   the longest work item

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:48 +02:00
Philipp Reisner e64a329459 drbd: Do not sleep long in drbd_start_resync
Work items that sleep too long can cause requests to take as
long as the longest sleeping work item.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:47 +02:00
Philipp Reisner 1f04af33fe drbd: Moved code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:46 +02:00
Philipp Reisner bc31fe3352 drbd: Eliminated the user of drbd_task_to_thread()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:45 +02:00
Philipp Reisner bed879ae90 drbd: Moved the thread name into the data structure
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:44 +02:00
Philipp Reisner b890733953 drbd: Moved the state functions into their own source file
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:43 +02:00
Andreas Gruenbacher db830c464b drbd: Local variable renames: e -> peer_req
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:42 +02:00
Andreas Gruenbacher 6c852beca1 drbd: Update some comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:41 +02:00
Andreas Gruenbacher 18b75d756b drbd: Clean up some left-overs
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:40 +02:00
Andreas Gruenbacher f6ffca9f42 drbd: Rename struct drbd_epoch_entry to struct drbd_peer_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:39 +02:00
Andreas Gruenbacher c6f7df42c9 drbd: Remove unused variable in struct drbd_conf
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:37 +02:00
Andreas Gruenbacher 70b1987663 drbd: Improve the drbd_find_overlap() documentation
Describe how to reach any further overlapping intervals from the first
overlap found.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:36 +02:00
Andreas Gruenbacher 43ae077d0a drbd: Make the peer_seq updating code more obvious
Make it more clear that update_peer_seq() is supposed to wake up the
seq_wait queue whenever the sequence number changes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:35 +02:00
Andreas Gruenbacher 6024fece73 drbd: Defer new writes when detecting conflicting writes
Before submitting a new local write request, wait for any conflicting
local or remote requests to complete.

We could assume that the new request occurred first and that the
conflicting requests overwrote it (and therefore discard the new
request), but we know for sure that the new request occurred after the
conflicting requests and so this behavior would be weird.  We would also
end up with the wrong result if the new request is not fully contained
within the conflicting requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:34 +02:00
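To make the notion of a "conflicting" request in the entry above concrete, here is a minimal userspace sketch in C of an overlap test between two sector ranges, plus the "fully contained" test the message alludes to. The struct and function names are hypothetical and not taken from the DRBD sources; the sketch assumes 512-byte sectors and sizes given in bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified request range: start sector plus size in bytes. */
struct range {
    uint64_t sector;   /* first 512-byte sector */
    uint32_t size;     /* length in bytes, assumed to be a multiple of 512 */
};

/* One past the last sector covered by r. */
static uint64_t range_end(const struct range *r)
{
    return r->sector + (r->size >> 9);
}

/* True if a and b share at least one sector, i.e. the requests conflict. */
static bool ranges_overlap(const struct range *a, const struct range *b)
{
    return a->sector < range_end(b) && b->sector < range_end(a);
}

/* True if inner lies completely within outer. */
static bool range_contains(const struct range *outer, const struct range *inner)
{
    return outer->sector <= inner->sector &&
           range_end(inner) <= range_end(outer);
}
```

A new write that overlaps an in-flight request without being contained in it is the case where simply discarding the new write would leave part of its data unwritten, which is presumably why the commit waits instead.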
Andreas Gruenbacher ddd8877d31 drbd: Remove unnecessary reference counting left-over
Nothing in this function accesses mdev->tconn->net_conf, so there is no
need for get_net_conf() / put_net_conf() anymore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:33 +02:00
Andreas Gruenbacher 5e4722645a drbd: _req_conflicts(): Get rid of the epoch_entries tree
Instead of keeping a separate tree for local and remote write requests
for finding requests and for conflict detection, use the same tree for
both purposes.  Introduce a flag to allow distinguishing the two
possible types of entries in this tree.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:32 +02:00
Andreas Gruenbacher 53840641bb drbd: Allow to wait for the completion of an epoch entry as well
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:31 +02:00
Andreas Gruenbacher 3e05146f0a drbd: Remove redundant check from drbd_contains_interval()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:30 +02:00
Andreas Gruenbacher a500c2efbb drbd: struct drbd_request: Introduce a new collision flag
This flag is set when a process puts itself to sleep to wait for a
conflicting request to complete.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:29 +02:00
Andreas Gruenbacher 9e204cddaf drbd: Move some functions to where they are used
Move drbd_update_congested() to drbd_main.c, and drbd_req_new() and
drbd_req_free() to drbd_req.c: those functions are not used anywhere
else.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:28 +02:00
Andreas Gruenbacher 3e394da184 drbd: Move sequence number logic into drbd_receiver.c and simplify it
These things are only used there.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:27 +02:00
Andreas Gruenbacher cc378270e4 drbd: Initialize the sequence number sent over the network even when not used
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:26 +02:00
Andreas Gruenbacher bdc7adb006 drbd: Remove redundant initialization
packet_seq is initialized by both sides of a connection in
drbd_connect().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:25 +02:00
Andreas Gruenbacher d876302306 drbd: Rename "enum drbd_packets" to "enum drbd_packet"
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:24 +02:00
Andreas Gruenbacher f2ad906379 drbd: Move cmdname() out of drbd_int.h
There is no good reason for cmdname() to be an inline function.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:23 +02:00
Philipp Reisner b42a70ad32 drbd: Do not access tconn after it was freed
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:22 +02:00
Philipp Reisner 257d0af689 drbd: Implemented receiving of new style packets on meta socket
Now drbd communication with protocol 100 actually works.
Replaced the remaining p_header80 with p_header where we
no longer know which header it is.

In the places where p_header80 is still in use, it is on
purpose, because we know that it is an old style header
there.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:18 +02:00
Philipp Reisner fd340c12c9 drbd: Use new header layout
The new header layout will only be used if the peer supports
it of course.

For the first packet and the handshake packet the old (h80)
layout is used for compatibility reasons.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:23:03 +02:00
Carsten Emde 6c4867f646 floppy: use del_timer_sync() in init cleanup
When no floppy is found the module code can be released while a timer
function is pending or about to be executed.

CPU0                                  CPU1
                                      floppy_init()
timer_softirq()
   spin_lock_irq(&base->lock);
   detach_timer();
   spin_unlock_irq(&base->lock);
   -> Interrupt
                                      del_timer();
                                      return -ENODEV;
                                      module_cleanup();
   <- EOI
   call_timer_fn();
   OOPS

Use del_timer_sync() to prevent this.

Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-09-21 10:22:11 +02:00
Ayan George 4c823cc3d5 drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd()
If the loop device is associated (lo->lo_state == Lo_bound), it will have
a valid bdev pointed to by lo->lo_device.  There is no reason to ever pass
an additional block_device pointer.

Signed-off-by: Ayan George <ayan.george@canonical.com>
Cc: Phillip Susi <psusi@cfl.rr.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-09-21 10:02:13 +02:00
Phillip Susi 8a9c594422 drivers/block/loop.c: emit uevent on auto release
The loopback driver failed to emit the change uevent when auto releasing
the device.  Fixed lo_release() to pass the bdev to loop_clr_fd() so it
can emit the event.

Signed-off-by: Phillip Susi <psusi@cfl.rr.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ayan George <ayan@ayan.net>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-09-21 10:02:13 +02:00
Sergei Shtylyov 5a3a76e6c3 drivers/block/cpqarray.c: use pci_dev->revision
This driver uses PCI_CLASS_REVISION instead of PCI_REVISION_ID, so it
wasn't converted by commit 44c10138fd ("PCI: Change all drivers to
use pci_device->revision").

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Cc: Chirag Kantharia <chirag.kantharia@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-09-21 10:02:13 +02:00
Jiri Kosina e060c38434 Merge branch 'master' into for-next
Fast-forward merge with Linus to be able to merge patches
based on more recent version of the tree.
2011-09-15 15:08:18 +02:00
Jesper Juhl e5de063016 Remove unneeded version.h includes from drivers/block/
It was pointed out by 'make versioncheck' that some includes of
linux/version.h are not needed in drivers/block/.
This patch removes them.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-15 14:57:06 +02:00
Justin P. Mattock 699324871f treewide: remove extra semicolons from various parts of the kernel
This is a resend of the original, changing the title from PATCH to
RFC (since this is a review for commit, and I should have put that the first time around),
and also removing some of the commits touching ia64 and bash since it is significant.
Let me know if I might have missed anything.

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-15 14:50:49 +02:00
Joe Perches 1d273b929c drbd: Use angle brackets for system includes
Use the normal include style.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-15 14:02:57 +02:00
Joe Perches 57f3224c3f drbd: Convert vmalloc/memset to vzalloc
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-15 13:55:02 +02:00
Christoph Hellwig 5a7bbad27a block: remove support for bio remapping from ->make_request
There is very little benefit in letting a ->make_request
instance update the bio's device and sector and loop around it in
__generic_make_request when we can achieve the same by calling
generic_make_request from the driver and letting the loop in
generic_make_request handle it.

Note that various drivers got the return value from ->make_request wrong and
returned non-zero values for errors.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-09-12 12:12:01 +02:00
Philipp Reisner c012949a40 drbd: Replaced all p_header80 with a generic p_header
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:26 +02:00
Philipp Reisner c6d25cfe52 drbd: Preparing to use p_header96 for all packets
recv_bm_rle_bits() should not make any assumptions about the layout
of the packet header

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:25 +02:00
Philipp Reisner 191d3cc8d9 drbd: Made drbd_flush_workqueue() take a tconn instead of an mdev
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:24 +02:00
Philipp Reisner a0638456c6 drbd: moved crypto transformations and friends from mdev to tconn
sed -i \
       -e 's/mdev->cram_hmac_tfm/mdev->tconn->cram_hmac_tfm/g' \
       -e 's/mdev->integrity_w_tfm/mdev->tconn->integrity_w_tfm/g' \
       -e 's/mdev->integrity_r_tfm/mdev->tconn->integrity_r_tfm/g' \
       -e 's/mdev->int_dig_out/mdev->tconn->int_dig_out/g' \
       -e 's/mdev->int_dig_in/mdev->tconn->int_dig_in/g' \
       -e 's/mdev->int_dig_vv/mdev->tconn->int_dig_vv/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:23 +02:00
Philipp Reisner 87eeee41f8 drbd: moved req_lock and transfer log from mdev to tconn
sed -i \
       -e 's/mdev->req_lock/mdev->tconn->req_lock/g' \
       -e 's/mdev->unused_spare_tle/mdev->tconn->unused_spare_tle/g' \
       -e 's/mdev->newest_tle/mdev->tconn->newest_tle/g' \
       -e 's/mdev->oldest_tle/mdev->tconn->oldest_tle/g' \
       -e 's/mdev->out_of_sequence_requests/mdev->tconn->out_of_sequence_requests/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:15 +02:00
Philipp Reisner 31890f4ab2 drbd: moved agreed_pro_version, last_received and ko_count to tconn
sed -i \
       -e 's/mdev->agreed_pro_version/mdev->tconn->agreed_pro_version/g' \
       -e 's/mdev->last_received/mdev->tconn->last_received/g' \
       -e 's/mdev->ko_count/mdev->tconn->ko_count/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:07 +02:00
Philipp Reisner e6b3ea83bc drbd: moved receiver, worker and asender from mdev to tconn
Patch mostly:
sed -i -e 's/mdev->receiver/mdev->tconn->receiver/g' \
       -e 's/mdev->worker/mdev->tconn->worker/g' \
       -e 's/mdev->asender/mdev->tconn->asender/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:06 +02:00
Philipp Reisner e42325a576 drbd: moved data and meta from mdev to tconn
Patch mostly:

sed -i -e 's/mdev->data/mdev->tconn->data/g' \
       -e 's/mdev->meta/mdev->tconn->meta/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:05 +02:00
Philipp Reisner b2fb6dbe52 drbd: moved net_cnt and net_cnt_wait from mdev to tconn
Patch partly generated by:

sed -i -e 's/get_net_conf(mdev)/get_net_conf(mdev->tconn)/g' \
       -e 's/put_net_conf(mdev)/put_net_conf(mdev->tconn)/g' \
       -e 's/get_net_conf(odev)/get_net_conf(odev->tconn)/g' \
       -e 's/put_net_conf(odev)/put_net_conf(odev->tconn)/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:04 +02:00
Philipp Reisner 89e58e755e drbd: moved net_conf from mdev to tconn
Besides moving the struct member, everything else is generated by:

sed -i -e 's/mdev->net_conf/mdev->tconn->net_conf/g' \
       -e 's/odev->net_conf/odev->tconn->net_conf/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:03 +02:00
Philipp Reisner 2111438b30 drbd: Minimal struct drbd_tconn
Starting to dissolve the network connection from the actual
block devices.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:02 +02:00
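The `mdev->tconn->...` rewrites in the sed scripts of the entries above all follow one pattern: a field moves out of the per-device structure into a connection structure that the device reaches through a pointer. The following is a minimal sketch of that pattern in C. The `_sketch` types are hypothetical, heavily trimmed stand-ins rather than the real DRBD structures, and the idea that two devices share one connection object is an assumption made for illustration.

```c
/* Hypothetical stand-in for the connection object: per-connection state. */
struct tconn_sketch {
    int agreed_pro_version;
    unsigned int ko_count;
};

/* Hypothetical stand-in for the block device object. */
struct mdev_sketch {
    struct tconn_sketch *tconn;  /* pointer to the connection it uses */
    unsigned long flags;         /* per-block-device state stays here */
};

/* What used to read mdev->agreed_pro_version now goes through tconn. */
static int mdev_protocol_version(const struct mdev_sketch *mdev)
{
    return mdev->tconn->agreed_pro_version;
}
```

With this layout, a change to the connection state is visible through every device that points at the same connection object, without copying the value into each device.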
Andreas Gruenbacher 6618bf1638 drbd: Interval tree bugfix
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:00 +02:00
Andreas Gruenbacher e3cfa7b26a drbd: Inline function overlaps() is now unused
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:59 +02:00
Andreas Gruenbacher 70dc65e1b3 drbd: Remove some useless paranoia code
The open_cnt check is an open-coded D_ASSERT() check.

In case the data.work queue is not empty, it does not really help to
know which drbd_work elements remained on that list: they will be freed
immediately afterwards, anyway.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:58 +02:00
Andreas Gruenbacher 841ce241fa drbd: Replace the ERR_IF macro with an assert-like macro
Remove the file name and line number from the syslog messages generated:
we have no duplicate function names, and no function contains the same
assertion more than once.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:57 +02:00
Andreas Gruenbacher e77a0a5cc1 drbd: Convert all constants in enum drbd_thread_state to upper case
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:56 +02:00
Andreas Gruenbacher 8554df1c6d drbd: Convert all constants in enum drbd_req_event to upper case
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:55 +02:00
Andreas Gruenbacher bb3bfe9614 drbd: Remove the unused hash tables
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:54 +02:00
Andreas Gruenbacher 8b946255f8 drbd: Use interval tree for overlapping epoch entry detection
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:53 +02:00
Andreas Gruenbacher 010f6e678f drbd: Put sector and size in struct drbd_epoch_entry into struct drbd_interval
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:52 +02:00
Andreas Gruenbacher bc9c5c4118 drbd: Use the read and write request trees for request lookups
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:50 +02:00
Andreas Gruenbacher dac1389ccc drbd: Add read_requests tree
We do not do collision detection for read requests, but we still need to
look up the request objects when we receive a packet over the network.
Using the same data structure for read and write requests results in
simpler code once the tl_hash and app_reads_hash tables are removed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:31 +02:00
Andreas Gruenbacher de696716e8 drbd: Use interval tree for overlapping write request detection
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:06 +02:00
Andreas Gruenbacher ace652acf2 drbd: Put sector and size in struct drbd_request into struct drbd_interval
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:05 +02:00
Andreas Gruenbacher 0939b0e5cd drbd: Add interval tree data structure
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:04 +02:00
Andreas Gruenbacher c3afd8f568 drbd: Request lookup code cleanup (4)
Factor out duplicate code in got_NegAck().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:03 +02:00
Andreas Gruenbacher ae3388daae drbd: Request lookup code cleanup (3)
Get rid of the ar_id_to_req() and ack_id_to_req() wrappers.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:02 +02:00
Andreas Gruenbacher 668eebc6a1 drbd: Request lookup code cleanup (2)
Unify the ar_id_to_req() and ack_id_to_req() functions: make both fail
if the consistency check fails.  Move the request lookup code now
duplicated in both functions into its own function.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:01 +02:00
Andreas Gruenbacher 5162458564 drbd: Request lookup code cleanup (1)
Move _ar_id_to_req() to drbd_receiver.c and mark it non-inline.  Remove
the leading underscores from _ar_id_to_req() and _ack_id_to_req().  Mark
ar_hash_slot() inline.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:00 +02:00
Andreas Gruenbacher 9c50842a35 drbd: Update outdated comment
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:59 +02:00
Andreas Gruenbacher d628769b3c drbd: Move drbd_free_tl_hash() to drbd_main()
This is the only place where this function is used.  Make it static.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:58 +02:00
Andreas Gruenbacher 579b57ed73 drbd: Magic reserved block_id value cleanup
The ID_VACANT definition has become entirely irrelevant by now.

The is_syncer_block_id() macro does not improve the code, so eliminate
it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:57 +02:00
Andreas Gruenbacher e7fad8af75 drbd: Endianness convert the constants instead of the variables
Converting the constants happens at compile time.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:57 +02:00
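The point of the entry above can be illustrated with a small userspace sketch in C. It uses `htonl()`/`ntohl()` from `<arpa/inet.h>` as stand-ins for the kernel's byte-order helpers, and `SKETCH_MAGIC` is an arbitrary example value, not a constant from the driver. Both functions give the same answer; the difference is only where the byte swap happens.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl(), ntohl() */

#define SKETCH_MAGIC 0x12345678u  /* arbitrary example value */

/* Converts the variable: a byte swap at run time on little-endian hosts. */
static bool magic_ok_convert_variable(uint32_t wire_magic)
{
    return ntohl(wire_magic) == SKETCH_MAGIC;
}

/*
 * Converts the constant: htonl() of a compile-time constant can usually be
 * folded by the compiler, leaving a plain comparison at run time.
 */
static bool magic_ok_convert_constant(uint32_t wire_magic)
{
    return wire_magic == htonl(SKETCH_MAGIC);
}
```

Here `wire_magic` is assumed to hold the value exactly as it arrived in network (big-endian) byte order.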
Andreas Gruenbacher ca9bc12b90 drbd: Get rid of BE_DRBD_MAGIC and BE_DRBD_MAGIC_BIG
Converting the constants happens at compile time.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:56 +02:00
Andreas Gruenbacher 9a8e77530f drbd: Consistently use block_id == ID_SYNCER for checksum based resync and online verify
DRBD_MAGIC has nothing to do with block ids and the funny values
computed were not actually used, anyway.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:55 +02:00
Andreas Gruenbacher 3980485361 drbd: Remove superfluous declaration
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:43 +02:00
Andreas Gruenbacher 28c455ceb2 drbd: Get rid of req_validator_fn typedef
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:57:31 +02:00
Kay Sievers e03c8dd149 loop: always allow userspace partitions and optionally support automatic scanning
Automatic partition scanning can be requested individually per loop
device during its setup by setting LO_FLAGS_PARTSCAN. By default, no
partition tables are scanned.

Userspace can now always add and remove partitions from all loop
devices, regardless if the in-kernel partition scanner is enabled or
not.

The needed partition minor numbers are allocated from the extended
minors space, the main loop device numbers will continue to match the
loop minors, regardless of the number of partitions used.

  # grep . /sys/class/block/loop1/loop/*
  /sys/block/loop1/loop/autoclear:0
  /sys/block/loop1/loop/backing_file:/home/kay/data/stuff/part.img
  /sys/block/loop1/loop/offset:0
  /sys/block/loop1/loop/partscan:1
  /sys/block/loop1/loop/sizelimit:0

  # ls -l /dev/loop*
  brw-rw---- 1 root disk   7,   0 Aug 14 20:22 /dev/loop0
  brw-rw---- 1 root disk   7,   1 Aug 14 20:23 /dev/loop1
  brw-rw---- 1 root disk 259,   0 Aug 14 20:23 /dev/loop1p1
  brw-rw---- 1 root disk 259,   1 Aug 14 20:23 /dev/loop1p2
  brw-rw---- 1 root disk   7,  99 Aug 14 20:23 /dev/loop99
  brw-rw---- 1 root disk 259,   2 Aug 14 20:23 /dev/loop99p1
  brw-rw---- 1 root disk 259,   3 Aug 14 20:23 /dev/loop99p2
  crw------T 1 root root  10, 237 Aug 14 20:22 /dev/loop-control

Cc: Karel Zak  <kzak@redhat.com>
Cc: Davidlohr Bueso <dave@gnu.org>
Acked-By: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-23 20:12:04 +02:00
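As a rough illustration of how userspace might request the partition scanning described in the entry above, the sketch below sets `LO_FLAGS_PARTSCAN` through the loop status ioctls from `<linux/loop.h>`. The helper name is hypothetical, and the sketch assumes the loop device is already bound to a backing file and that the caller has sufficient privileges; it is not taken from the patch itself.

```c
#include <sys/ioctl.h>
#include <linux/loop.h>

/*
 * Ask the kernel to scan partitions on an already-bound loop device.
 * loop_fd is an open file descriptor for e.g. /dev/loop1.
 * Returns 0 on success, -1 on error (errno is set by ioctl()).
 */
static int loop_request_partscan(int loop_fd)
{
    struct loop_info64 info;

    /* Read the current settings so only the flag is changed. */
    if (ioctl(loop_fd, LOOP_GET_STATUS64, &info) < 0)
        return -1;

    info.lo_flags |= LO_FLAGS_PARTSCAN;

    return ioctl(loop_fd, LOOP_SET_STATUS64, &info);
}
```

On success, the `partscan` attribute shown in the sysfs listing above would be expected to read 1 for that device.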
Jens Axboe 89c63a8ef3 Merge branch 'stable/for-jens' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
2011-08-23 15:09:13 +02:00
Joe Jin 1bc05b0ae6 xen-blkback: fixed indentation and comments
This patch fixes the following:

1. Fix code style issue.
2. Fix incorrect functions name in comments.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-08-22 11:35:36 -04:00
Joe Jin 6f5986bce5 xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed.
When running a block-attach/block-detach test with the steps below, umount hangs
in the guest. Furthermore, shutdown ends up stuck when unmounting file systems.

1. Start the guest.
2. Attach a new block device with xm block-attach in Dom0.
3. Mount the new disk in the guest.
4. Execute xm block-detach in Dom0 to detach the block device, until it times out.
5. Any request to the disk will hang.

Root cause:
This issue is caused by setting the backend device's state to
'XenbusStateClosing', which sends the XenbusStateClosing notification to the
frontend. When the frontend receives the notification, it tries to release
the disk in blkfront_closing(), but at that moment the disk is still in use
by the guest, so the frontend refuses to close. Specifically, it sets the disk state to
XenbusStateClosing and sends the notification to the backend. When the backend receives the
event, it disconnects the vbd from the real device and sets the vbd device state to
XenbusStateClosing. The backend disconnects the real device/file, and any IO
requests to the disk in the guest end up in the ether, leaving the disk DEAD and set to
XenbusStateClosing. When the guest wants to disconnect the disk, umount
hangs on blkif_release()->xlvbd_release_gendisk() as it is unable to send any IO
to the disk, which prevents a clean system shutdown.

Solution:
Don't disconnect the backend until the frontend state has switched to XenbusStateClosed.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
[v1: Modified description a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-08-22 11:35:35 -04:00
Lukas Czerner dfaa2ef68e loop: add discard support for loop devices
This commit adds discard support for loop devices. Discard is usually
supported by SSD and thinly provisioned devices as a method for
reclaiming unused space. This is no different than trying to reclaim
back space which is not used by the file system on the image, but it
still occupies space on the host file system.

We can do the reclamation on file systems that support hole
punching. So when a discard request gets to the loop driver we can
translate that into punching a hole in the underlying file, hence reclaiming
the free space.

This is very useful for trimming down the size of the image to only what
is really used by the file system on that image. Fstrim may be used for
that purpose.

It has been tested on ext4, xfs and btrfs with the image file systems
ext4, ext3, xfs and btrfs. An ext4 or ext3 image on an ext4 file system has
some problems, but it seems that the ext4 punch hole implementation is
somewhat flawed and it is unrelated to this commit.

Also, this is a very good method of validating a file system's punch hole
implementation.

Note that when encryption is used, discard support is disabled, because
using it might leak some information useful to a possible attacker.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-19 14:50:46 +02:00
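The hole punching that the entry above relies on is also available to userspace through `fallocate(2)`. The sketch below is a userspace analogue in C, not the loop driver's code: the helper name is hypothetical, and it only shows what "translate a discard into punching a hole in the backing file" amounts to for a given byte range.

```c
#define _GNU_SOURCE          /* for fallocate() */
#include <fcntl.h>
#include <linux/falloc.h>    /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <sys/types.h>

/*
 * Deallocate [offset, offset + len) in the file behind fd while keeping the
 * file size unchanged, so later reads of that range return zeros.
 * Returns 0 on success, -1 on error; errno is EOPNOTSUPP when the host
 * file system cannot punch holes.
 */
static int punch_hole(int fd, off_t offset, off_t len)
{
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, len);
}
```

Comparing `du` output for an image file before and after such a call (or after running fstrim on a mounted loop device) is one way to observe the reclaimed space on the host file system.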
Andrew Morton 548ef6cc26 nbd-replace-some-printk-with-dev_warn-and-dev_info-checkpatch-fixes
ERROR: code indent should use tabs where possible
#30: FILE: drivers/block/nbd.c:578:
+^I        dev_info(disk_to_dev(lo->disk), "NBD_DISCONNECT\n");$

total: 1 errors, 0 warnings, 35 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/nbd-replace-some-printk-with-dev_warn-and-dev_info.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-19 14:48:28 +02:00
WANG Cong 5eedf5415c nbd: replace some printk with dev_warn() and dev_info()
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-19 14:48:28 +02:00
WANG Cong 7742ce4ab4 nbd: lower the loglevel of an error message
This is only an error, no need to use KERN_CRIT log level.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-19 14:48:28 +02:00
WANG Cong 7f1b90f99a nbd: replace printk KERN_ERR with dev_err()
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-19 14:48:22 +02:00
WANG Cong 1695b87f7d nbd: replace sysfs_create_file() with device_create_file()
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-19 14:48:21 +02:00
WANG Cong 25ac0c2b97 nbd: use task_pid_nr() to get current pid
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-19 14:48:17 +02:00
Jens Axboe 40bb96ade4 Merge branch 'stable/for-jens' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus 2011-08-09 20:43:26 +02:00
Konrad Rzeszutek Wilk ea5e116162 xen/blkback: Make description more obvious.
With the frontend having Xen but the backend not, it just looks odd:

  <*>   Xen virtual block device support
  <*>   Block-device backend driver

Fix it to have the 'Xen' in front of it.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-08-09 11:12:14 -04:00
Joe Handzik f963d270cb cciss: add transport mode attribute to sys
Signed-off-by: Joseph Handzik <joseph.t.handzik@beardog.cce.hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-08 11:40:17 +02:00
Joseph Handzik 1304953700 cciss: Adds simple mode functionality
Signed-off-by: Joseph Handzik <joseph.t.handzik@beardog.cce.hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-08 11:40:15 +02:00
Axel Lin f41c53a569 block: swim3: fix unterminated of_device_id table
of_device_id structures need a NULL terminating entry, add it.
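
To illustrate why the terminating entry matters, here is a minimal
userspace sketch of a sentinel-terminated match table. The names
match_entry, demo_match and find_match() are made up for this example,
and the struct is a simplified stand-in: the kernel's struct
of_device_id holds fixed-size character arrays rather than a pointer.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the kernel's struct of_device_id. */
struct match_entry {
	const char *compatible;
};

/*
 * The table carries no separate length. Lookup code walks it until it
 * reaches an entry whose string is NULL, so the empty last entry is
 * what stops the scan from running past the end of the array.
 */
static const struct match_entry demo_match[] = {
	{ .compatible = "vendor,device-a" },
	{ .compatible = "vendor,device-b" },
	{ .compatible = NULL },		/* terminating entry */
};

/* Return the matching entry, or NULL if the table has none. */
static const struct match_entry *
find_match(const struct match_entry *table, const char *compatible)
{
	for (; table->compatible != NULL; table++) {
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	}
	return NULL;
}
```

Without the last entry, a lookup for a string that is not in the table
would keep reading whatever memory follows the array.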

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-03 15:02:55 +02:00
H Hartley Sweeten ddad9ef582 drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse
The buffer 'sc.cpu_mask' is a kernel buffer.  If bitmap_parse is used
instead of __bitmap_parse the extra parameter that indicates a kernel
buffer is not needed.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-02 12:43:49 +02:00
Kay Sievers 05eb0f252b loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other
LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
waits for show() to finish to remove the sysfs file.

  cat /sys/class/block/loop0/loop/backing_file
    mutex_lock_nested+0x176/0x350
    ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    ? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
    dev_attr_show+0x1b/0x60
    ? sysfs_read_file+0x86/0x1a0
    ? __get_free_pages+0x12/0x50
    sysfs_read_file+0xaf/0x1a0

  ioctl(LOOP_CLR_FD):
    wait_for_common+0x12c/0x180
    ? try_to_wake_up+0x2a0/0x2a0
    wait_for_completion+0x18/0x20
    sysfs_deactivate+0x178/0x180
    ? sysfs_addrm_finish+0x43/0x70
    ? sysfs_addrm_start+0x1d/0x20
    sysfs_addrm_finish+0x43/0x70
    sysfs_hash_and_remove+0x85/0xa0
    sysfs_remove_group+0x59/0x100
    loop_clr_fd+0x1dc/0x3f0 [loop]
    lo_ioctl+0x223/0x7a0 [loop]

Instead of taking the lo_ctl_mutex from sysfs code, take the inner
lo->lo_lock, to protect the access to the backing_file data.

Thanks to Tejun for help debugging and finding a solution.

Cc: Milan Broz <mbroz@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-31 22:21:35 +02:00
Kay Sievers d134b00b9a loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices
Instead of unconditionally creating a fixed number of dead loop
devices which need to be investigated by storage handling services,
even when they are never used, we allow distros to start with 0
loop devices and have losetup(8) and similar tools switch to the
dynamic /dev/loop-control interface instead of searching /dev/loop%i
for free devices.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-31 22:08:04 +02:00
Kay Sievers 770fe30a46 loop: add management interface for on-demand device allocation
Loop devices today have a fixed, pre-allocated count, usually 8.
The number can only be changed at module init time. To find a free
device to use, /dev/loop%i needs to be scanned, and devices need to be
opened one after another until a free one is found, if any.

This adds a new /dev/loop-control device node that allows userspace to
dynamically find or allocate a free device, and to add and remove loop
devices from the running system:
 LOOP_CTL_ADD adds a specific device. Arg is the number
 of the device. It returns the device i or a negative
 error code.

 LOOP_CTL_REMOVE removes a specific device. Arg is the
 number of the device. It returns the device i or a negative
 error code.

 LOOP_CTL_GET_FREE finds the next unbound device or allocates
 a new one. No arg is given. It returns the device i or a
 negative error code.

The loop kernel module gets automatically loaded when
/dev/loop-control is accessed the first time. The alias
specified in the module instructs udev to create this
'dead' device node, even when the module is not loaded.

Example:
 cfd = open("/dev/loop-control", O_RDWR);

 /* add a new specific loop device */
 err = ioctl(cfd, LOOP_CTL_ADD, devnr);

 /* remove a specific loop device */
 err = ioctl(cfd, LOOP_CTL_REMOVE, devnr);

 /* find or allocate a free loop device to use */
 devnr = ioctl(cfd, LOOP_CTL_GET_FREE);

 sprintf(loopname, "/dev/loop%i", devnr);
 ffd = open("backing-file", O_RDWR);
 lfd = open(loopname, O_RDWR);
 err = ioctl(lfd, LOOP_SET_FD, ffd);

Cc: Tejun Heo <tj@kernel.org>
Cc: Karel Zak  <kzak@redhat.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-31 22:08:04 +02:00
Kay Sievers 34dd82afd2 loop: replace linked list of allocated devices with an idr index
Replace the linked list that keeps track of allocated devices with an
idr index to allow a more efficient lookup of devices.

Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-31 22:08:04 +02:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
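
For readers unfamiliar with the helper named above:
atomic_inc_not_zero() increments a counter only if it is currently
non-zero. The following is a userspace sketch of that behaviour using
C11 atomics. It is not the kernel's implementation, and the name
inc_not_zero() is made up for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *v unless it is zero.
 * Returns true if the increment happened, false if *v was zero.
 */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}
```

This pattern is typical for reference counts, where a count of zero
means the object is already on its way to being freed.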

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Linus Torvalds ba5b56cb3e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
  ceph: document unlocked d_parent accesses
  ceph: explicitly reference rename old_dentry parent dir in request
  ceph: document locking for ceph_set_dentry_offset
  ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug
  ceph: protect d_parent access in ceph_d_revalidate
  ceph: protect access to d_parent
  ceph: handle racing calls to ceph_init_dentry
  ceph: set dir complete frag after adding capability
  rbd: set blk_queue request sizes to object size
  ceph: set up readahead size when rsize is not passed
  rbd: cancel watch request when releasing the device
  ceph: ignore lease mask
  ceph: fix ceph_lookup_open intent usage
  ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC
  ceph: fix bad parent_inode calc in ceph_lookup_open
  ceph: avoid carrying Fw cap during write into page cache
  libceph: don't time out osd requests that haven't been received
  ceph: report f_bfree based on kb_avail rather than diffing.
  ceph: only queue capsnap if caps are dirty
  ceph: fix snap writeback when racing with writes
  ...
2011-07-26 13:38:50 -07:00
Josh Durgin 029bcbd8b0 rbd: set blk_queue request sizes to object size
This improves performance since more requests can be merged.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-07-26 11:29:35 -07:00
Yehuda Sadeh 79e3057c4c rbd: cancel watch request when releasing the device
We were missing this cleanup, so when a device was released
the osd didn't clean up its watchers list, and subsequent notifications
could be slow because the osd needed to time out on the client.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2011-07-26 11:29:04 -07:00
Linus Torvalds 8ded371f81 Merge branch 'for-3.1/drivers' of git://git.kernel.dk/linux-block
* 'for-3.1/drivers' of git://git.kernel.dk/linux-block:
  cciss: do not attempt to read from a write-only register
  xen/blkback: Add module alias for autoloading
  xen/blkback: Don't let in-flight requests defer pending ones.
  bsg: fix address space warning from sparse
  bsg: remove unnecessary conditional expressions
  bsg: fix bsg_poll() to return POLLOUT properly
2011-07-25 10:38:18 -07:00
Linus Torvalds bbd9d6f7fb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
  vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
  isofs: Remove global fs lock
  jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
  fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
  mm/truncate.c: fix build for CONFIG_BLOCK not enabled
  fs:update the NOTE of the file_operations structure
  Remove dead code in dget_parent()
  AFS: Fix silly characters in a comment
  switch d_add_ci() to d_splice_alias() in "found negative" case as well
  simplify gfs2_lookup()
  jfs_lookup(): don't bother with . or ..
  get rid of useless dget_parent() in btrfs rename() and link()
  get rid of useless dget_parent() in fs/btrfs/ioctl.c
  fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
  drivers: fix up various ->llseek() implementations
  fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
  Ext4: handle SEEK_HOLE/SEEK_DATA generically
  Btrfs: implement our own ->llseek
  fs: add SEEK_HOLE and SEEK_DATA flags
  reiserfs: make reiserfs default to barrier=flush
  ...

Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
shrinker callout for the inode cache, that clashed with the xfs code to
start the periodic workers later.
2011-07-22 19:02:39 -07:00
Linus Torvalds a99a7d1436 Merge branch 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  mips: Fix i8253 clockevent fallout
  i8253: Cleanup outb/inb magic
  arm: Footbridge: Use common i8253 clockevent
  mips: Use common i8253 clockevent
  x86: Use common i8253 clockevent
  i8253: Create common clockevent implementation
  i8253: Export i8253_lock unconditionally
  pcpskr: MIPS: Make config dependencies finer grained
  pcspkr: Cleanup Kconfig dependencies
  i8253: Move remaining content and delete asm/i8253.h
  i8253: Consolidate definitions of PIT_LATCH
  x86: i8253: Consolidate definitions of global_clock_event
  i8253: Alpha, PowerPC: Remove unused asm/8253pit.h
  alpha: i8253: Cleanup remaining users of i8253pit.h
  i8253: Remove I8253_LOCK config
  i8253: Make pcsp sound driver use the shared i8253_lock
  i8253: Make pcspkr input driver use the shared i8253_lock
  i8253: Consolidate all kernel definitions of i8253_lock
  i8253: Unify all kernel declarations of i8253_lock
  i8253: Create linux/i8253.h and use it in all 8253 related files
2011-07-22 16:51:56 -07:00
Linus Torvalds 8181780c16 Merge branch 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6:
  dt: include linux/errno.h in linux/of_address.h
  of/address: Add of_find_matching_node_by_address helper
  dt: remove extra xsysace platform_driver registration
  tty/serial: Add devicetree support for nVidia Tegra serial ports
  dt: add empty of_property_read_u32[_array] for non-dt
  dt: bindings: move SEC node under new crypto/
  dt: add helper function to read u32 arrays
  tty/serial: change of_serial to use new of_property_read_u32() api
  dt: add 'const' for of_property_read_string parameter **out_string
  dt: add helper functions to read u32 and string property values
  tty: of_serial: support for 32 bit accesses
  dt: document the of_serial bindings
  dt/platform: allow device name to be overridden
  drivers/amba: create devices from device tree
  dt: add of_platform_populate() for creating device from the device tree
  dt: Add default match table for bus ids
2011-07-22 14:53:38 -07:00
Linus Torvalds 111ad119d1 Merge branch 'stable/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pciback: Have 'passthrough' option instead of XEN_PCIDEV_BACKEND_PASS and XEN_PCIDEV_BACKEND_VPCI
  xen/pciback: Remove the DEBUG option.
  xen/pciback: Drop two backends, squash and cleanup some code.
  xen/pciback: Print out the MSI/MSI-X (PIRQ) values
  xen/pciback: Don't setup an fake IRQ handler for SR-IOV devices.
  xen: rename pciback module to xen-pciback.
  xen/pciback: Fine-grain the spinlocks and fix BUG: scheduling while atomic cases.
  xen/pciback: Allocate IRQ handler for device that is shared with guest.
  xen/pciback: Disable MSI/MSI-X when reseting a device
  xen/pciback: guest SR-IOV support for PV guest
  xen/pciback: Register the owner (domain) of the PCI device.
  xen/pciback: Cleanup the driver based on checkpatch warnings and errors.
  xen/pciback: xen pci backend driver.
  xen: tmem: self-ballooning and frontswap-selfshrinking
  xen: Add module alias to autoload backend drivers
  xen: Populate xenbus device attributes
  xen: Add __attribute__((format(printf... where appropriate
  xen: prepare tmem shim to handle frontswap
  xen: allow enable use of VGA console on dom0
2011-07-22 13:45:15 -07:00
Al Viro e7f5909707 kill useless checks for sb->s_op == NULL
never is...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:44:21 -04:00
Grant Likely 8c11642a50 Merge commit 'v3.0-rc7' into devicetree/next 2011-07-15 20:11:34 -06:00
Stefan Bader 89153b5cae xen-blkfront: Fix one off warning about name clash
Avoid telling users to use xvde and onwards when using xvde.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-07-14 14:19:51 -04:00
Stefan Bader 196cfe2ae8 xen-blkfront: Drop name and minor adjustments for emulated scsi devices
These were intended to avoid the namespace clash when representing
emulated IDE and SCSI devices. However, that seems to confuse users
more than expected (a disk defined as sda becomes xvde).
So for now go back to the scheme which does no adjustments. This
will break when mixing IDE and SCSI names in the configuration of
guests, but that should be expected by now.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-07-14 14:19:33 -04:00
Grant Likely 5d10302f46 dt: remove extra xsysace platform_driver registration
After commit 1c48a5c93, "dt: Eliminate
of_platform_{,un}register_driver", the xsysace driver attempts to
register two platform_drivers with the same name, which a) doesn't
work, and b) isn't necessary.  This patch merges the two
platform_drivers.

Reported-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-14 05:33:52 -06:00
Stephen M. Cameron 07d0c38e7d cciss: do not attempt to read from a write-only register
Most smartarrays will tolerate it, but some new ones don't.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>

Note: this is a regression caused by commit 1ddd5049
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-09 09:04:12 +02:00
Bastian Blank a7e9357f10 xen/blkback: Add module alias for autoloading
Add xen-backend:vbd module alias to the xen-blkback module. This allows
automatic loading of the module.

Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-30 12:48:25 -04:00
Daniel Stodden b4726a9df2 xen/blkback: Don't let in-flight requests defer pending ones.
Running RING_FINAL_CHECK_FOR_REQUESTS from make_response is a bad
idea. It means that in-flight I/O is essentially blocking continued
batches. This kills throughput on frontends which unplug (or even
just notify) early and rightfully assume additional requests
will be picked up on time, not synchronously.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
[v1: Rebased and fixed compile problems]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-30 12:48:06 -04:00
Joe Perches 08b8bfc1c6 xen: Add __attribute__((format(printf... where appropriate
Use the compiler to verify printf formats and arguments.

Fix fallout.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-30 12:14:40 -04:00
Jens Axboe 7b28afe01a Merge branch 'for-3.0-important' of git://git.drbd.org/linux-2.6-drbd into for-linus 2011-06-30 10:10:50 +02:00
Lars Ellenberg 86e1e98e5c drbd: we should write meta data updates with FLUSH FUA
We used to write these with BIO_RW_BARRIER aka REQ_HARDBARRIER (unless
disabled in the configuration). The correct semantic now would be to
write with FLUSH/FUA.
For example, with activity log transactions, FUA alone is not enough, we
need the corresponding bitmap update (and all related application
updates) on stable storage as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-06-30 09:23:46 +02:00
Lars Ellenberg cb6518cbef drbd: when receive times out on meta socket, also check last receive time on data socket
If we have an asymmetrically congested network, we may send P_PING,
but due to congestion, the corresponding P_PING_ACK would time out,
and we would drop a (congested, but otherwise) healthy connection
("PingAck did not arrive in time.")

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-06-30 09:23:44 +02:00
Lars Ellenberg 5a8b424276 drbd: account bitmap IO during resync as resync-(related-)-io
If we have a good resync rate, we will frequently update the on-disk
bitmap, which, if not accounted for as resync io, may let an otherwise
idle device appear to be "busy", and cause us to throttle resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-06-30 09:23:43 +02:00
Lars Ellenberg 8ccee20e3e drbd: don't cond_resched_lock with IRQs disabled
The last commit, drbd: add missing spinlock to bitmap receive,
introduced a cond_resched_lock(), where the lock in question is taken
with irqs disabled.

As we must not schedule with IRQs disabled,
and cond_resched_lock_irq() does not exist yet,
we re-acquire the spin_lock_irq() for each bitmap page processed in turn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-06-30 09:23:42 +02:00
Lars Ellenberg 829c608786 drbd: add missing spinlock to bitmap receive
During bitmap exchange, when using the RLE bitmap compression scheme,
we have a code path that can set the whole bitmap at once.

To avoid holding spin_lock_irq() for too long, we used to lock out other
bitmap modifications during bitmap exchange by other means, and then,
knowing we have exclusive access to the bitmap, modify it without
the spinlock, and with IRQs enabled.

Since we now allow local IO to continue, potentially setting additional
bits during the bitmap receive phase, this is no longer true, and we get
uncoordinated updates of bitmap members, causing bm_set to no longer
accurately reflect the total number of set bits.

To actually see this, you'd need to have a large bitmap, use RLE bitmap
compression, and have busy IO during sync handshake and bitmap exchange.

Fix this by taking the spin_lock_irq() in this code path as well, but
calling cond_resched_lock() after each page worth of bits processed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-06-30 09:23:41 +02:00
Philipp Reisner 0cfdd247d1 drbd: Use the correct max_bio_size when creating resync requests
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-06-30 09:23:40 +02:00
Ralf Baechle 334955ef96 i8253: Create linux/i8253.h and use it in all 8253 related files
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/20110601180610.054254048@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

 arch/arm/mach-footbridge/isa-timer.c |    2 +-
 arch/mips/cobalt/time.c              |    2 +-
 arch/mips/jazz/irq.c                 |    2 +-
 arch/mips/kernel/i8253.c             |    2 +-
 arch/mips/mti-malta/malta-time.c     |    2 +-
 arch/mips/sgi-ip22/ip22-time.c       |    2 +-
 arch/mips/sni/time.c                 |    2 +-
 arch/x86/kernel/apic/apic.c          |    2 +-
 arch/x86/kernel/apm_32.c             |    2 +-
 arch/x86/kernel/hpet.c               |    2 +-
 arch/x86/kernel/i8253.c              |    2 +-
 arch/x86/kernel/time.c               |    2 +-
 drivers/block/hd.c                   |    2 +-
 drivers/clocksource/i8253.c          |    2 +-
 drivers/input/gameport/gameport.c    |    2 +-
 drivers/input/joystick/analog.c      |    2 +-
 drivers/input/misc/pcspkr.c          |    2 +-
 include/linux/i8253.h                |   11 +++++++++++
 sound/drivers/pcsp/pcsp.h            |    2 +-
 19 files changed, 29 insertions(+), 18 deletions(-)
2011-06-09 15:01:37 +02:00
Linus Torvalds 4f1ba49efa Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: Use hlist_entry() for io_context.cic_list.first
  cfq-iosched: Remove bogus check in queue_fail path
  xen/blkback: potential null dereference in error handling
  xen/blkback: don't call vbd_size() if bd_disk is NULL
  block: blkdev_get() should access ->bd_disk only after success
  CFQ: Fix typo and remove unnecessary semicolon
  block: remove unwanted semicolons
  Revert "block: Remove extra discard_alignment from hd_struct."
  nbd: adjust 'max_part' according to part_shift
  nbd: limit module parameters to a sane value
  nbd: pass MSG_* flags to kernel_recvmsg()
  block: improve the bio_add_page() and bio_add_pc_page() descriptions
2011-06-04 08:11:26 +09:00
Linus Torvalds 0f48f26009 block: fix mismerge of the DISK_EVENT_MEDIA_CHANGE removal
Jens' back-merge commit 698567f3fa ("Merge commit 'v2.6.39' into
for-2.6.40/core") was incorrectly done, and re-introduced the
DISK_EVENT_MEDIA_CHANGE lines that had been removed earlier in commits

 - 9fd097b149 ("block: unexport DISK_EVENT_MEDIA_CHANGE for
   legacy/fringe drivers")

 - 7eec77a181 ("ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd
   and ide-cd")

because of conflicts with the "g->flags" updates near-by by commit
d4dc210f69 ("block: don't block events on excl write for non-optical
devices")

As a result, we re-introduced the hanging behavior due to infinite disk
media change reports.

Tssk, tssk, people! Don't do back-merges at all, and *definitely* don't
do them to hide merge conflicts from me - especially as I'm likely
better at merging them than you are, since I do so many merges.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-02 05:29:19 +09:00
Dan Carpenter 9b83c77121 xen/blkback: potential null dereference in error handling
blkbk->pending_pages can be NULL here so I added a check for it.

Signed-off-by: Dan Carpenter <error27@gmail.com>
[v1: Redid the loop a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-01 09:28:21 -04:00
Laszlo Ersek 6464920a6e xen/blkback: don't call vbd_size() if bd_disk is NULL
...because vbd_size() dereferences bd_disk if bd_part is NULL.

Signed-off-by: Laszlo Ersek<lersek@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-01 09:28:20 -04:00
Liu Yuan 6917f83ffe drivers, block: virtio_blk: Replace cryptic number with the macro
It is easier to figure out the context by reading SCSI_SENSE_BUFFERSIZE
instead of plain '96'.

Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:13 +09:30
Christoph Hellwig 7a7c924cf0 virtio_blk: allow re-reading config space at runtime
Wire up the virtio_driver config_changed method to get notified about
config changes raised by the host.  For now we just re-read the device
size to support online resizing of devices, but once we add more
attributes that might be changeable they could be added as well.

Note that the config_changed method is called from irq context, so
we'll have to use the workqueue infrastructure to provide us a proper
user context for our changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:13 +09:30
Linus Torvalds f310642123 Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param
  x86 idle: deprecate "no-hlt" cmdline param
  x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
  x86 idle floppy: deprecate disable_hlt()
  x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
  x86 idle: clarify AMD erratum 400 workaround
  idle governor: Avoid lock acquisition to read pm_qos before entering idle
  cpuidle: menu: fixed wrapping timers at 4.294 seconds
2011-05-29 11:18:09 -07:00
Len Brown 3b70b2e5fc x86 idle floppy: deprecate disable_hlt()
Plan to remove floppy_disable_hlt in 2012, an ancient
workaround with comments that it should be removed.

This allows us to remove clutter and a run-time branch
from the idle code.

WARN_ONCE() on invocation until it is removed.

cc: x86@kernel.org
cc: stable@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:15 -04:00
Namhyung Kim 5988ce2396 nbd: adjust 'max_part' according to part_shift
The 'max_part' parameter determines how many partitions are supported
on each nbd device. However, the actual number can be rounded up to the
form of a power of 2 minus 1 during module initialization, as
alloc_disk() is called with (1 << part_shift) for some reason.

So adjust 'max_part' as well, at least for consistency with loop and
brd. It is exported via sysfs already, and a user should check this
value after module loading if [s]he wants to use that number correctly
(e.g. with fdisk or something).
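
A minimal userspace sketch of the adjustment described above, assuming
part_shift is derived from 'max_part' with an fls()-style "find last
set bit" helper. The names fls_demo() and effective_max_part() are made
up for this example.

```c
/*
 * Userspace stand-in for the kernel's fls(): position of the highest
 * set bit, counting from 1. fls_demo(0) is 0.
 */
static int fls_demo(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Round max_part up to the nearest value of the form 2^n - 1. */
static unsigned int effective_max_part(unsigned int max_part)
{
	int part_shift = fls_demo(max_part);

	/* Avoid an undefined 32-bit shift; all bits set is already 2^32 - 1. */
	if (part_shift >= 32)
		return ~0U;

	return (1U << part_shift) - 1;
}
```

Under these assumptions a requested max_part of 5 becomes 7 and a
requested 8 becomes 15, while 0 and 7 stay unchanged.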

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-28 14:44:46 +02:00
Namhyung Kim 3b2710824e nbd: limit module parameters to a sane value
The 'max_part' parameter controls the maximum number of partitions
an nbd device can have. However, if a user specifies a very large
value, it would exceed the limit of the device minor number and
can cause a kernel oops (or, at least, produce invalid device
nodes in some cases).

In addition, specifying a large 'nbds_max' value causes the same
problem for the same reason.
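
The kind of check this calls for can be sketched in userspace as
follows. The constants DEMO_MINORBITS and DEMO_DISK_MAX_PARTS are
assumed values chosen for illustration, and fls_demo() and
check_nbd_params() are hypothetical names, not the driver's code.

```c
#include <errno.h>

/* Assumed limits for this sketch; the kernel defines its own values. */
#define DEMO_MINORBITS		20
#define DEMO_DISK_MAX_PARTS	256

/* Position of the highest set bit, counting from 1; 0 for x == 0. */
static int fls_demo(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/*
 * Return 0 if the parameters fit into the minor number space,
 * -EINVAL otherwise.
 */
static int check_nbd_params(unsigned int max_part, unsigned int nbds_max)
{
	int part_shift = fls_demo(max_part);

	/* Too many minors requested per device. */
	if (part_shift >= 32 || (1UL << part_shift) > DEMO_DISK_MAX_PARTS)
		return -EINVAL;

	/* The devices, each using 2^part_shift minors, must fit as well. */
	if (nbds_max > 1UL << (DEMO_MINORBITS - part_shift))
		return -EINVAL;

	return 0;
}
```

With these assumed limits, max_part=100000 is rejected up front
instead of reaching device registration.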

On my desktop, the following command results in a kernel BUG:

$ sudo modprobe nbd max_part=100000
 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
 invalid opcode: 0000 [#1] SMP
 last sysfs file: /sys/devices/virtual/block/nbd4/range
 CPU 1
 Modules linked in: nbd(+) bridge stp llc kvm_intel kvm asus_atk0110 sg sr_mod cdrom

 Pid: 2522, comm: modprobe Tainted: G        W   2.6.39-leonard+ #159 System manufacturer System Product Name/P5G41TD-M PRO
 RIP: 0010:[<ffffffff8115aa08>]  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
 RSP: 0018:ffff8801009f1de8  EFLAGS: 00010246
 RAX: 00000000ffffffef RBX: ffff880103920478 RCX: 00000000000a7bd3
 RDX: ffffffff81a2dbe0 RSI: 0000000000000000 RDI: ffff880103920478
 RBP: ffff8801009f1e38 R08: ffff880103920468 R09: ffff880103920478
 R10: ffff8801009f1de8 R11: ffff88011eccbb68 R12: ffffffff81a2dbe0
 R13: ffff880103920468 R14: 0000000000000000 R15: ffff880103920400
 FS:  00007f3c49de9700(0000) GS:ffff88011f800000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 00007f3b7fe7c000 CR3: 00000000cd58d000 CR4: 00000000000406e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process modprobe (pid: 2522, threadinfo ffff8801009f0000, task ffff8801009a93a0)
 Stack:
  ffff8801009f1e58 ffffffff812e8f6e ffff8801009f1e58 ffffffff812e7a80
  ffff880000000010 ffff880103920400 ffff8801002fd0c0 ffff880103920468
  0000000000000011 ffff880103920400 ffff8801009f1e48 ffffffff8115ab6a
 Call Trace:
  [<ffffffff812e8f6e>] ? device_add+0x4f1/0x5e4
  [<ffffffff812e7a80>] ? dev_set_name+0x41/0x43
  [<ffffffff8115ab6a>] sysfs_create_group+0x13/0x15
  [<ffffffff810b857e>] blk_trace_init_sysfs+0x14/0x16
  [<ffffffff811ee58b>] blk_register_queue+0x4c/0xfd
  [<ffffffff811f3bdf>] add_disk+0xe4/0x29c
  [<ffffffffa007e2ab>] nbd_init+0x2ab/0x30d [nbd]
  [<ffffffffa007e000>] ? 0xffffffffa007dfff
  [<ffffffff8100020f>] do_one_initcall+0x7f/0x13e
  [<ffffffff8107ab0a>] sys_init_module+0xa1/0x1e3
  [<ffffffff814f3542>] system_call_fastpath+0x16/0x1b
 Code: 41 57 41 56 41 55 41 54 53 48 83 ec 28 0f 1f 44 00 00 48 89 fb 41 89 f6 49 89 d4 48 85 ff 74 0b 85 f6 75 0b 48 83
  7f 30 00 75 14 <0f> 0b eb fe b9 ea ff ff ff 48 83 7f 30 00 0f 84 09 01 00 00 49
 RIP  [<ffffffff8115aa08>] internal_create_group+0x2f/0x166
  RSP <ffff8801009f1de8>
 ---[ end trace 753285ffbf72c57c ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-28 14:44:46 +02:00
Namhyung Kim 35fbf5bcf4 nbd: pass MSG_* flags to kernel_recvmsg()
Unlike kernel_sendmsg(), kernel_recvmsg() requires the flags to be passed
explicitly via its last parameter rather than via struct msghdr.msg_flags.
As a result, the MSG_WAITALL flag in calls to
sock_xmit(lo, 0, ..., MSG_WAITALL) was not handled properly by the TCP
layer. Fix it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-28 14:44:46 +02:00
Namhyung Kim ac04fee0b5 loop: export module parameters
Export the 'max_loop' and 'max_part' parameters to sysfs so that users can
see how many devices are allowed and how many partitions are supported.

If 'max_loop' is 0, there is no restriction on the number of loop devices.
Users can create and use as many devices as there are minor numbers
available. If 'max_part' is 0, the device simply does not support
partitioning.

Also note that 'max_part' may be adjusted to the form 2^n - 1 if needed.
Users should check this value after loading the module if they want to
use that number correctly (e.g. with fdisk, mknod, etc.).
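
The "power of 2 minus 1" adjustment mentioned above can be illustrated with a small sketch. This is not the driver code; it assumes the rounding is derived from the position of the highest set bit (what the kernel's fls() returns), and the function name round_max_part is made up for illustration.

```python
def round_max_part(max_part):
    """Return (part_shift, adjusted max_part) for a requested max_part.

    Hypothetical helper: assumes the adjustment is based on the highest
    set bit, which int.bit_length() reports the same way fls() does.
    """
    if max_part <= 0:
        # 0 means the device does not support partitioning at all.
        return 0, 0
    part_shift = max_part.bit_length()
    return part_shift, (1 << part_shift) - 1
```

Under this assumption a request for max_part=10 would be reported back as 15, which is why the value should be read from sysfs after the module is loaded.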

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-27 07:59:25 +02:00
Namhyung Kim 8892cbaf68 brd: export module parameters
Export the 'rd_nr', 'rd_size' and 'max_part' parameters to sysfs so that
users can see how many devices are allowed, how big each device is and how
many partitions are supported. If 'max_part' is 0, the device simply does
not support partitioning.

Also note that 'max_part' may be adjusted to the form 2^n - 1 if needed.
Users should check this value after loading the module if they want to
use that number correctly (e.g. with fdisk, mknod, etc.).

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-26 21:06:50 +02:00
Namhyung Kim 13868b76ab brd: fix comment on initial device creation
If the 'rd_nr' param is not specified, 16 devices (adjustable via
CONFIG_BLK_DEV_RAM_COUNT) are created by default, but the comment
said 1. Fix it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-26 21:06:50 +02:00
Namhyung Kim af46566885 brd: handle on-demand devices correctly
When finding or allocating a ram disk device, brd_probe() did not take
partition numbers into account, so it could end up with a different
device. Consider the following example (I set CONFIG_BLK_DEV_RAM_COUNT=4
for simplicity):

$ sudo modprobe brd max_part=15
$ ls -l /dev/ram*
brw-rw---- 1 root disk 1,  0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1, 16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1, 32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1, 48 2011-05-25 15:41 /dev/ram3
$ sudo mknod /dev/ram4 b 1 64
$ sudo dd if=/dev/zero of=/dev/ram4 bs=4k count=256
256+0 records in
256+0 records out
1048576 bytes (1.0 MB) copied, 0.00215578 s, 486 MB/s
$ ls -l /dev/ram*
brw-rw---- 1 root disk 1,    0 2011-05-25 15:41 /dev/ram0
brw-rw---- 1 root disk 1,   16 2011-05-25 15:41 /dev/ram1
brw-rw---- 1 root disk 1,   32 2011-05-25 15:41 /dev/ram2
brw-rw---- 1 root disk 1,   48 2011-05-25 15:41 /dev/ram3
brw-r--r-- 1 root root 1,   64 2011-05-25 15:45 /dev/ram4
brw-rw---- 1 root disk 1, 1024 2011-05-25 15:44 /dev/ram64

After this patch, /dev/ram4 is accessed correctly, instead of
/dev/ram64.

In addition, the 'range' passed to blk_register_region() should
cover the whole range of dev_t that RAMDISK_MAJOR can address.
It does not need to be limited by partition numbers unless the
'rd_nr' param was specified.
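
The listing above follows from simple arithmetic on the minor number. The sketch below assumes that with max_part=15 each disk owns 16 consecutive minors (a shift of 4); the helper names are illustrative and are not taken from brd.c.

```python
PART_SHIFT = 4  # assumed: max_part=15 reserves 16 minors per disk


def device_index(minor, part_shift=PART_SHIFT):
    """Map a minor number to the index of the disk that owns it."""
    return minor >> part_shift


def first_minor(index, part_shift=PART_SHIFT):
    """Map a disk index to the first minor number of that disk."""
    return index << part_shift
```

Treating minor 64 directly as a disk index gives first_minor(64) == 1024, which matches the unexpected /dev/ram64 node at 1, 1024. Shifting first gives device_index(64) == 4, that is /dev/ram4.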

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-26 21:06:50 +02:00
Namhyung Kim 315980c868 brd: limit 'max_part' module param to DISK_MAX_PARTS
The 'max_part' parameter controls the maximum number of partitions
a brd device can have. However, if a user specifies a very large
value it would exceed the limit of the device minor number and
can cause a kernel panic (or, at least, produce invalid device
nodes in some cases).

On my desktop system, the following command kills the kernel. On qemu,
it triggers a similar oops but the kernel stays alive:
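
A rough sketch of why a huge 'max_part' runs out of minor numbers. It assumes 20-bit minors (MINORBITS = 20), DISK_MAX_PARTS = 256 and a shift derived from the highest set bit; the function names are hypothetical and the check in the actual patch may be written differently.

```python
MINORBITS = 20        # assumed width of the minor part of dev_t
DISK_MAX_PARTS = 256  # assumed value of the kernel constant


def part_shift_for(max_part):
    """Number of minor bits reserved for partitions (fls-style)."""
    return max_part.bit_length() if max_part > 0 else 0


def overflowing_devices(max_part, nr_devices):
    """Indexes of devices whose first minor no longer fits in MINORBITS."""
    shift = part_shift_for(max_part)
    return [i for i in range(nr_devices) if (i << shift) >= (1 << MINORBITS)]


def part_shift_is_valid(max_part):
    """Sanity check in the spirit of the DISK_MAX_PARTS limit."""
    return (1 << part_shift_for(max_part)) <= DISK_MAX_PARTS
```

With max_part=100000 the shift is 17, so in a 16-device setup the devices from index 8 upward would need minors beyond the 20-bit range.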

$ sudo modprobe brd max_part=100000
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
 IP: [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
 PGD 7af1067 PUD 7b19067 PMD 0
 Oops: 0000 [#1] SMP
 last sysfs file:
 CPU 0
 Modules linked in: brd(+)

 Pid: 44, comm: insmod Tainted: G        W   2.6.39-qemu+ #158 Bochs Bochs
 RIP: 0010:[<ffffffff81110a9a>]  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
 RSP: 0018:ffff880007b15d78  EFLAGS: 00000286
 RAX: ffff880007b05478 RBX: ffff880007a52760 RCX: ffff880007b15dc8
 RDX: ffff880007a4f900 RSI: ffff880007b15e48 RDI: ffff880007a52760
 RBP: ffff880007b15da8 R08: 0000000000000002 R09: 0000000000000000
 R10: ffff880007b15e48 R11: ffff880007b05478 R12: 0000000000000000
 R13: ffff880007b05478 R14: 0000000000400920 R15: 0000000000000063
 FS:  0000000002160880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000058 CR3: 0000000007b1c000 CR4: 00000000000006b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
 Process insmod (pid: 44, threadinfo ffff880007b14000, task ffff880007acb980)
 Stack:
  ffff880007b15dc8 ffff880007b05478 ffff880007b15da8 00000000fffffffe
  ffff880007a52760 ffff880007b05478 ffff880007b15de8 ffffffff81143c0a
  0000000000400920 ffff880007a52760 ffff880007b05478 0000000000000000
 Call Trace:
  [<ffffffff81143c0a>] kobject_add_internal+0xdf/0x1a0
  [<ffffffff81143da1>] kobject_add_varg+0x41/0x50
  [<ffffffff81143e6b>] kobject_add+0x64/0x66
  [<ffffffff8113bbe7>] blk_register_queue+0x5f/0xb8
  [<ffffffff81140f72>] add_disk+0xdf/0x289
  [<ffffffffa00040df>] brd_init+0xdf/0x1aa [brd]
  [<ffffffffa0004000>] ? 0xffffffffa0003fff
  [<ffffffffa0004000>] ? 0xffffffffa0003fff
  [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
  [<ffffffff8108516c>] sys_init_module+0x9c/0x1dc
  [<ffffffff812ff4bb>] system_call_fastpath+0x16/0x1b
 Code: 89 e5 41 55 41 54 53 48 89 fb 48 83 ec 18 48 85 ff 75 04 0f 0b eb fe 48 8b 47 18 49 c7 c4 70 1e 4d 81 48 85 c0 74 04 4c 8b 60 30
  8b 44 24 58 45 31 ed 0f b6 c4 85 c0 74 0d 48 8b 43 28 48 89
 RIP  [<ffffffff81110a9a>] sysfs_create_dir+0x2d/0xae
  RSP <ffff880007b15d78>
 CR2: 0000000000000058
 ---[ end trace aebb1175ce1f6739 ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-26 21:06:50 +02:00
Namhyung Kim a2cba2913c brd: get rid of unused members from struct brd_device
brd_refcnt, brd_offset, brd_sizelimit and brd_blocksize in struct
brd_device seem to have been copied from struct loop_device, but
they're not used anywhere. Let's get rid of them.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-26 21:06:50 +02:00
Linus Torvalds 57bb559574 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
  ceph: fix cap flush race reentrancy
  libceph: subscribe to osdmap when cluster is full
  libceph: handle new osdmap down/state change encoding
  rbd: handle online resize of underlying rbd image
  ceph: avoid inode lookup on nfs fh reconnect
  ceph: use LOOKUPINO to make unconnected nfs fh more reliable
  rbd: use snprintf for disk->disk_name
  rbd: cleanup: make kfree match kmalloc
  rbd: warn on update_snaps failure on notify
  ceph: check return value for start_request in writepages
  ceph: remove useless check
  libceph: add missing breaks in addr_set_port
  libceph: fix TAG_WAIT case
  ceph: fix broken comparison in readdir loop
  libceph: fix osdmap timestamp assignment
  ceph: fix rare potential cap leak
  libceph: use snprintf for unknown addrs
  libceph: use snprintf for formatting object name
  ceph: use snprintf for dirstat content
  libceph: fix uninitialized value when no get_authorizer method is set
  ...
2011-05-25 11:46:31 -07:00
Linus Torvalds 929cfdd5d3 Merge branch 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.40/drivers' of git://git.kernel.dk/linux-2.6-block: (110 commits)
  loop: handle on-demand devices correctly
  loop: limit 'max_part' module param to DISK_MAX_PARTS
  drbd: fix warning
  drbd: fix warning
  drbd: Fix spelling
  drbd: fix schedule in atomic
  drbd: Take a more conservative approach when deciding max_bio_size
  drbd: Fixed state transitions after async outdate-peer-handler returned
  drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
  drbd: Fix for the connection problems on high latency links
  drbd: fix potential activity log refcount imbalance in error path
  drbd: Only downgrade the disk state in case of disk failures
  drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int
  drbd: fix potential distributed deadlock
  lru_cache.h: fix comments referring to ts_ instead of lc_
  drbd: Fix for application IO with the on-io-error=pass-on policy
  xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions.
  xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override.
  xen/blkback: don't fail empty barrier requests
  xen/blkback: fix xenbus_transaction_start() hang caused by double xenbus_transaction_end()
  ...
2011-05-25 09:15:35 -07:00
Linus Torvalds 798ce8f1cc Merge branch 'for-2.6.40/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.40/core' of git://git.kernel.dk/linux-2.6-block: (40 commits)
  cfq-iosched: free cic_index if cfqd allocation fails
  cfq-iosched: remove unused 'group_changed' in cfq_service_tree_add()
  cfq-iosched: reduce bit operations in cfq_choose_req()
  cfq-iosched: algebraic simplification in cfq_prio_to_maxrq()
  blk-cgroup: Initialize ioc->cgroup_changed at ioc creation time
  block: move bd_set_size() above rescan_partitions() in __blkdev_get()
  block: call elv_bio_merged() when merged
  cfq-iosched: Make IO merge related stats per cpu
  cfq-iosched: Fix a memory leak of per cpu stats for root group
  backing-dev: Kill set but not used var in  bdi_debug_stats_show()
  block: get rid of on-stack plugging debug checks
  blk-throttle: Make no throttling rule group processing lockless
  blk-cgroup: Make cgroup stat reset path blkg->lock free for dispatch stats
  blk-cgroup: Make 64bit per cpu stats safe on 32bit arch
  blk-throttle: Make dispatch stats per cpu
  blk-throttle: Free up a group only after one rcu grace period
  blk-throttle: Use helper function to add root throtl group to lists
  blk-throttle: Introduce a helper function to fill in device details
  blk-throttle: Dynamically allocate root group
  blk-cgroup: Allow sleeping while dynamically allocating a group
  ...
2011-05-25 09:14:07 -07:00
Sage Weil 9db4b3e327 rbd: handle online resize of underlying rbd image
If we get a notification that the image header has changed, check for
a change in the image size.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-24 11:52:08 -07:00
Sage Weil aedfec59ee rbd: use snprintf for disk->disk_name
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-24 11:52:03 -07:00
Sage Weil 916d4d6727 rbd: cleanup: make kfree match kmalloc
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-24 11:52:01 -07:00
Namhyung Kim a1c15c59fe loop: handle on-demand devices correctly
When finding or allocating a loop device, loop_probe() did not take
partition numbers into account, so it could end up with a different
device. Consider the following example:

$ sudo modprobe loop max_part=15
$ ls -l /dev/loop*
brw-rw---- 1 root disk 7,   0 2011-05-24 22:16 /dev/loop0
brw-rw---- 1 root disk 7,  16 2011-05-24 22:16 /dev/loop1
brw-rw---- 1 root disk 7,  32 2011-05-24 22:16 /dev/loop2
brw-rw---- 1 root disk 7,  48 2011-05-24 22:16 /dev/loop3
brw-rw---- 1 root disk 7,  64 2011-05-24 22:16 /dev/loop4
brw-rw---- 1 root disk 7,  80 2011-05-24 22:16 /dev/loop5
brw-rw---- 1 root disk 7,  96 2011-05-24 22:16 /dev/loop6
brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7
$ sudo mknod /dev/loop8 b 7 128
$ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img
$ sudo losetup -a
/dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img)
$ ls -l /dev/loop*
brw-rw---- 1 root disk 7,    0 2011-05-24 22:16 /dev/loop0
brw-rw---- 1 root disk 7,   16 2011-05-24 22:16 /dev/loop1
brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128
brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1
brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2
brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3
brw-rw---- 1 root disk 7,   32 2011-05-24 22:16 /dev/loop2
brw-rw---- 1 root disk 7,   48 2011-05-24 22:16 /dev/loop3
brw-rw---- 1 root disk 7,   64 2011-05-24 22:16 /dev/loop4
brw-rw---- 1 root disk 7,   80 2011-05-24 22:16 /dev/loop5
brw-rw---- 1 root disk 7,   96 2011-05-24 22:16 /dev/loop6
brw-rw---- 1 root disk 7,  112 2011-05-24 22:16 /dev/loop7
brw-r--r-- 1 root root 7,  128 2011-05-24 22:17 /dev/loop8

After this patch, /dev/loop8 is accessed correctly, instead of
/dev/loop128.

In addition, the 'range' passed to blk_register_region() should
cover the whole range of dev_t that LOOP_MAJOR can address. It does
not need to be limited by partition numbers unless the 'max_loop'
param was specified.
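
The 'range' rule described above can be sketched as follows. This assumes 20-bit minor numbers and that a specified 'max_loop' limits the region to max_loop disks of (1 << part_shift) minors each; region_range is a hypothetical helper, not the loop.c code.

```python
MINORBITS = 20  # assumed width of the minor part of dev_t


def region_range(max_loop, part_shift):
    """Number of minors to register for the loop major.

    max_loop == 0 means "no limit", so the whole minor space is covered;
    otherwise only the minors of the first max_loop disks are covered.
    """
    if max_loop:
        return max_loop << part_shift
    return 1 << MINORBITS
```

With max_part=15 (shift 4) and no 'max_loop', minor 128 of the manually created /dev/loop8 falls inside the registered range; with max_loop=8 the range would end at minor 127.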

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24 16:48:55 +02:00
Namhyung Kim 78f4bb367f loop: limit 'max_part' module param to DISK_MAX_PARTS
The 'max_part' parameter controls the maximum number of partitions
a loop block device can have. However, if a user specifies a very
large value it would exceed the limit of the device minor number
and can cause a kernel panic (or, at least, produce invalid
device nodes in some cases).

On my desktop system, the following command kills the kernel. On qemu,
it triggers a similar oops but the kernel stays alive:

$ sudo modprobe loop max_part=100000
 ------------[ cut here ]------------
 kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65!
 invalid opcode: 0000 [#1] SMP
 last sysfs file:
 CPU 0
 Modules linked in: loop(+)

 Pid: 43, comm: insmod Tainted: G        W   2.6.39-qemu+ #155 Bochs Bochs
 RIP: 0010:[<ffffffff8113ce61>]  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
 RSP: 0018:ffff880007b3fde8  EFLAGS: 00000246
 RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4
 RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878
 RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000
 R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50
 R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800
 FS:  0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
 Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c0)
 Stack:
  ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b
  0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868
  0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8
 Call Trace:
  [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af
  [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e
  [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12
  [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16
  [<ffffffff8116a090>] blk_register_queue+0x47/0xf7
  [<ffffffff8116f527>] add_disk+0xdf/0x290
  [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop]
  [<ffffffffa0006000>] ? 0xffffffffa0005fff
  [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e
  [<ffffffff81096804>] sys_init_module+0x9c/0x1e0
  [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b
 Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49
 RIP  [<ffffffff8113ce61>] internal_create_group+0x2a/0x170
  RSP <ffff880007b3fde8>
 ---[ end trace a123eb592043acad ]---

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24 16:48:54 +02:00
Andrew Morton 0ddf72be4e drbd: fix warning
In file included from drivers/block/drbd/drbd_main.c:54:
drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type

Forward declarations of enums do not work.

Fix it unpleasantly by moving the prototype.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Lars Ellenberg <drbd-dev@lists.linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2011-05-24 10:38:33 +02:00
Philipp Reisner 9b2f61aec7 drbd: fix warning
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24 10:38:32 +02:00
Bart Van Assche 24c4830c8e drbd: Fix spelling
Found these with the help of ispell -l.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24 10:21:29 +02:00
Lars Ellenberg 9a0d9d0389 drbd: fix schedule in atomic
An administrative detach used to request a state change directly to D_DISKLESS,
first suspending IO to avoid the last put_ldev() occurring from an endio handler,
potentially in irq context.

This is not enough on the receiving side (typically secondary): we may miss
some peer_req on the way to local disk, which then may do the last put_ldev()
from their drbd_peer_request_endio().

This patch makes the detach always go through the intermediate D_FAILED state.
We may consider renaming it to D_DETACHING.

An alternative approach would be to create yet another work item to be scheduled
on the worker, do the destructor work from there, and get the timing right.

Manually picked commit 564040f from the drbd 8.4 branch.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:14:32 +02:00
Philipp Reisner 99432fcc52 drbd: Take a more conservative approach when deciding max_bio_size
The old (optimistic) implementation could shrink the bio size
on a primary device.

Shrinking the bio size on a primary device is bad, since we
might get BIOs with the old (bigger) size shortly after
we published the new size.

The new implementation is more conservative, and eventually
increases the max_bio_size on a primary device (which is valid).
It does so when it knows the local limit AND the remote limit.

We cache the last seen max_bio_size of the peer in the meta
data, and rely on that to make the operation of single
nodes more efficient.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:08:58 +02:00
Philipp Reisner 21423fa791 drbd: Fixed state transitions after async outdate-peer-handler returned
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:08:11 +02:00
Philipp Reisner fa7d939663 drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:07:50 +02:00
Philipp Reisner a8e407925d drbd: Fix for the connection problems on high latency links
It seems that the real cause of all the issues was that
we did not notice in drbd_try_connect() when the other
guy closes one socket if the round trip time gets higher
than 100ms. That 100ms was hard-coded!

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:07:22 +02:00
Lars Ellenberg 76727f684a drbd: fix potential activity log refcount imbalance in error path
It is no longer sufficient to trigger on local WRITE,
we need to check on (rq_state & RQ_IN_ACT_LOG)
before calling drbd_al_complete_io also in the error path.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:06:44 +02:00
Philipp Reisner d2e17807e3 drbd: Only downgrade the disk state in case of disk failures
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:05:48 +02:00
Lars Ellenberg f36af18c7b drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int
If there is no replication traffic within the idle timeout
(ping-int seconds), DRBD will send a P_PING,
and adjust the timeout to ping-timeout.

If there is no P_PING_ACK received within this ping-timeout,
DRBD finally drops the connection, and tries to re-establish it.

To decide which timeout was active, we compared the current timeout
with the ping-timeout, and dropped the connection, if that was the case.

By default, ping-int is 10 seconds, ping-timeout is 500 ms.

Unfortunately, if you configure ping-timeout to be the same as ping-int,
expiry of the idle-timeout had been mistaken for a missing ping ack,
and caused an immediate reconnection attempt.

Fix:
Allow both timeouts to be equal, use a local variable
to store which timeout is active.
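
The difference between the two approaches can be sketched in a few lines. This is only an illustration of the reasoning above, with made-up function names and both timeouts expressed in the same unit (milliseconds here, as an assumption); it is not the DRBD receiver code.

```python
def old_decision(current_timeout_ms, ping_timeout_ms):
    """Old logic: infer the active timeout by comparing values."""
    if current_timeout_ms == ping_timeout_ms:
        return "drop connection"
    return "send ping"


def new_decision(ping_timeout_active):
    """New logic: an explicit flag records which timeout is active."""
    return "drop connection" if ping_timeout_active else "send ping"
```

If ping-int and ping-timeout are both configured as 10000 ms, an expired idle timeout makes old_decision(10000, 10000) return "drop connection", while new_decision(False) still sends the ping first.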

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:03:30 +02:00
Lars Ellenberg 53ea433145 drbd: fix potential distributed deadlock
We limit ourselves to a configurable maximum number of pages used as
temporary bio pages.

If the configured "max_buffers" is not big enough to match the bandwidth
of the respective deployment, a distributed deadlock could be triggered
by e.g. fast online verify and heavy application IO.

TCP connections would block on congestion, because both receivers
would wait on pages to become available.

Fortunately the respective senders in this case would be able to give
back some pages already. So do that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:02:41 +02:00
Philipp Reisner 738a84b25c drbd: Fix for application IO with the on-io-error=pass-on policy
In case a write fails on the local disk, go into D_INCONSISTENT
disk state. That causes future reads of that block to be shipped
to the peer.

Read retry remote was already in place.

Actually the documentation needs to get fixed now, since the
application is still shielded from the error (as long as we have
only a single disk failing). The difference to detach is that
we keep the disk, and therefore might keep all the other, still
working sectors up to date.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 09:59:49 +02:00
Paul Gortmaker 70c7160619 Add appropriate <linux/prefetch.h> include for prefetch users
After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout.  Give
them an explicit include via the following $0.02 script.

 =========================================
 #!/bin/bash
 MANUAL=""
 for i in `git grep -l 'prefetch(.*)' .` ; do
 	grep -q '<linux/prefetch.h>' $i
 	if [ $? = 0 ] ; then
 		continue
 	fi

 	(	echo '?^#include <linux/?a'
 		echo '#include <linux/prefetch.h>'
 		echo .
 		echo w
 		echo q
 	) | ed -s $i > /dev/null 2>&1
 	if [ $? != 0 ]; then
 		echo $i needs manual fixup
 		MANUAL="$i $MANUAL"
 	fi
 done
 echo ------------------- 8\<----------------------
 echo vi $MANUAL
 =========================================

Signed-off-by: Paul <paul.gortmaker@windriver.com>
[ Fixed up some incorrect #include placements, and added some
  non-network drivers and the fib_trie.c case    - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-22 21:41:57 -07:00
Jens Axboe 698567f3fa Merge commit 'v2.6.39' into for-2.6.40/core
Since for-2.6.40/core was forked off the 2.6.39 devel tree, we've
had churn in the core area that makes it difficult to handle
patches for e.g. cfq or blk-throttle. Instead of requiring that they
be based on older versions with bugs that have been fixed later
in the rc cycle, merge in 2.6.39 final.

Also fixes up conflicts in the below files.

Conflicts:
	drivers/block/paride/pcd.c
	drivers/cdrom/viocd.c
	drivers/ide/ide-cd.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-20 20:33:15 +02:00
Sage Weil 13143d2d1c rbd: warn on update_snaps failure on notify
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-19 11:25:05 -07:00
Jens Axboe 779d530632 Merge branches 'for-jens/xen-backend-fixes' and 'for-jens/xen-blkback-v3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-2.6.40/drivers
2011-05-19 09:46:00 +02:00
Jan Beulich 8ab521506c xen/blkback: don't fail empty barrier requests
The sector number on empty barrier requests may (will?) be -1, which,
given that it's being treated as an unsigned 64-bit quantity, will almost
always exceed the actual (virtual) disk's size.

Inspired by Konrad's "When writting barriers set the sector number to
zero...".

While at it also add overflow checking to the math in vbd_translate().
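
A small sketch of the arithmetic involved. It assumes sector numbers are 64-bit unsigned values and that empty requests skip the bounds check; request_fits is a hypothetical helper and not the vbd_translate() code itself.

```python
U64_MAX = (1 << 64) - 1


def as_u64(value):
    """Reinterpret a Python int as an unsigned 64-bit quantity."""
    return value & U64_MAX


def request_fits(sector, nr_sects, disk_sectors):
    """Return True if the request lies within the disk."""
    if nr_sects == 0:
        # Empty (barrier) request: the sector number is not meaningful.
        return True
    start = as_u64(sector)
    end = start + nr_sects
    if end > U64_MAX:
        # The sum would wrap around in 64-bit arithmetic: reject it.
        return False
    return end <= disk_sectors
```

Here as_u64(-1) is 18446744073709551615, far beyond any realistic disk size, which is why an empty barrier carrying sector -1 used to be rejected.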

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-18 11:28:16 -04:00
Linus Torvalds a2b9c1f620 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: don't delay blk_run_queue_async
  scsi: remove performance regression due to async queue run
  blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup
  block: rescan partitions on invalidated devices on -ENOMEDIA too
  cdrom: always check_disk_change() on open
  block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
2011-05-18 06:49:02 -07:00
Yehuda Sadeh 1fec70932d rbd: fix split bio handling
The rbd driver currently splits bios when they span an object boundary.
However, the blk_end_request expects the completions to roll up the results
in block device order, and the split rbd/ceph ops can complete in any
order.  This patch adds a struct rbd_req_coll to track completion of split
requests and ensures that the results are passed back up to the block layer
in order.

This fixes errors where the file system gets completion of a read operation
that spans an object boundary before the data has actually arrived.  The
bug is easily reproduced with iozone with a working set larger than
available RAM.
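
The in-order roll-up idea can be sketched as below. The class is a simplified, hypothetical stand-in for the rbd_req_coll tracking described above: pieces may complete in any order, but results are only passed upward for the contiguous completed prefix.

```python
class RequestCollection:
    """Track completions of the pieces of one split request."""

    def __init__(self, total):
        self.status = [None] * total  # None means "not completed yet"
        self.num_done = 0             # pieces already reported upward
        self.reported = []            # (index, ret, length), in order

    def complete(self, index, ret, length):
        """Record one completion and report any newly contiguous prefix."""
        self.status[index] = (ret, length)
        while (self.num_done < len(self.status)
               and self.status[self.num_done] is not None):
            done_ret, done_len = self.status[self.num_done]
            self.reported.append((self.num_done, done_ret, done_len))
            self.num_done += 1
```

If piece 2 of three completes first, nothing is reported until pieces 0 and 1 have also completed, so the block layer never sees a later range finished before an earlier one.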

Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-13 13:52:57 -07:00
Laszlo Ersek 496b318eb6 xen/blkback: fix xenbus_transaction_start() hang caused by double xenbus_transaction_end()
vbd_resize() up_read()'s xs_state.suspend_mutex twice in a row via double
xenbus_transaction_end() calls. The next down_read() in
xenbus_transaction_start() (e.g. at the next resize attempt) hangs.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=618317

Acked-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-13 09:45:40 -04:00
Sage Weil 11f770027b rbd: fix leak of ops struct
The ops vector must be freed by the rbd_do_request caller.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-12 20:59:14 -07:00
Konrad Rzeszutek Wilk 5185432277 xen/blkback: Align the tabs on the structure.
The recent changes caused this field of the structure to be offset a bit.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 18:02:28 -04:00
Konrad Rzeszutek Wilk cca537af7d xen/blkback: if log_stats is enabled print out the data.
And do not depend on the driver being built with the -DDEBUG flag.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:54 -04:00
Konrad Rzeszutek Wilk 5a577e3872 xen/blkback: Add the prefix XEN in the common.h.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:53 -04:00
Konrad Rzeszutek Wilk 3d814731ba xen/blkback: Prefix 'vbd' with 'xen' in structs and functions.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:52 -04:00
Konrad Rzeszutek Wilk 30fd150202 xen/blkback: Change structure name blkif_st to xen_blkif.
No need for that '_st' and xen_blkif is more apt.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:51 -04:00
Konrad Rzeszutek Wilk 325a648604 xen/blkback: Remove the unused typedefs.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:50 -04:00
Konrad Rzeszutek Wilk 452a6b2bb6 xen/blkback: Move include/xen/blkif.h into drivers/block/xen-blkback/common.h
There is no point in having the blkif.h file. It is not used by the frontend.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:49 -04:00
Konrad Rzeszutek Wilk b0f801273f xen/blkback: Fixing some more of the cleanpatch.pl warnings.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:48 -04:00
Konrad Rzeszutek Wilk 03e0edf946 xen/blkback: Checkpatch.pl recommends against multiple assignments.
CHECK: multiple assignments should be avoided

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:47 -04:00
Konrad Rzeszutek Wilk a4c348580e xen/blkback: Flesh out the description in the Kconfig.
with more details.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 17:55:40 -04:00
Konrad Rzeszutek Wilk b9fc02968c xen/blkback: Fix spelling mistakes.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 16:43:21 -04:00
Konrad Rzeszutek Wilk 68c88dd7d3 xen/blkback: Move blkif_get_x86_[32|64]_req to common.h in block/xen-blkback dir.
From the blkif.h header, which was exposed to the frontend.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 16:43:20 -04:00
Konrad Rzeszutek Wilk 72468bfcb8 xen/blkback: Removing the debug_lvl option.
It is not really used for anything.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 16:43:20 -04:00
Konrad Rzeszutek Wilk 22b20f2dff xen/blkback: Use the DRV_PFX in the pr_.. macros.
To make it easier to read.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 16:43:12 -04:00
Konrad Rzeszutek Wilk 1afbd730a3 xen/blkback: Make the DPRINTK uniform.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 16:42:51 -04:00
Konrad Rzeszutek Wilk ebe8190659 xen/blkback: Change printk/DPRINTK to pr_.. type variant.
And also make them uniform and prefix the message with 'xen-blkback'.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 16:42:31 -04:00
Konrad Rzeszutek Wilk edf6ef59ec xen-blkfront: Introduce BLKIF_OP_FLUSH_DISKCACHE support.
If the backend supports the 'feature-flush-cache' mode, use that
instead of the 'feature-barrier' support.

Currently there are three backends that support the 'feature-flush-cache'
mode: NetBSD, Solaris and the Linux kernel. The 'flush' option is a much
lighter-weight operation than the 'barrier' support, so let's use it, as
there are no filesystems in the kernel that use full barriers anymore.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 08:56:03 -04:00
Marek Marczykowski 4352b47ab7 xen-blkfront: fix data size for xenbus_gather in blkfront_connect
The barrier variable is an int, not a long. The resulting overflow overwrote
another variable: "err" (in PV code) and "binfo" (in xenlinux code -
drivers/xen/blkfront/blkfront.c). The latter caused incorrect device
flags (RO/removable etc).

Signed-off-by: Marek Marczykowski <marmarek@mimuw.edu.pl>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
[v1: Changed title]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 08:55:51 -04:00
Konrad Rzeszutek Wilk 01f37f2d53 xen/blkback: Fixed up comments and converted spaces to tabs.
Suggested-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-11 15:57:09 -04:00
Jens Axboe edc83d47a9 cciss: fix compile issue
drivers/block/cciss.c: In function ‘cciss_send_reset’:
drivers/block/cciss.c:2515:2: error: implicit declaration of function ‘fill_cmd’
drivers/block/cciss.c: At top level:
drivers/block/cciss.c:2531:12: error: conflicting types for ‘fill_cmd’
drivers/block/cciss.c:2534:1: note: an argument type that has a default promotion can’t match an empty parameter name list declaration
drivers/block/cciss.c:2515:18: note: previous implicit declaration of ‘fill_cmd’ was here
make[1]: *** [drivers/block/cciss.o] Error 1
make: *** [drivers/block/cciss.o] Error 2

Move fill_cmd() to above where it is first used.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:27:00 -06:00
Stephen M. Cameron 8a4ec67bd5 cciss: add cciss_tape_cmds module parameter
This is to allow number of commands reserved for use by SCSI tape drives
and medium changers to be adjusted at driver load time via the kernel
parameter cciss_tape_cmds, with a default value of 6, and a range
of 2 - 16 inclusive.  Previously, the driver limited the number of
commands which could be queued to the SCSI half of the driver
to only 2.  This is to fix the problem that if you had more than
two tape drives, you couldn't, for example, erase or rewind them all
at the same time.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:59 -06:00
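As an illustration of the cciss_tape_cmds range described in the commit above (default 6, valid range 2 - 16 inclusive), here is a minimal userspace sketch. The helper name clamp_tape_cmds and the choice to clamp out-of-range values rather than reject them are assumptions of this sketch, not something the commit message states.

```c
/* Bounds and default as given in the commit message. */
#define TAPE_CMDS_MIN     2
#define TAPE_CMDS_MAX     16
#define TAPE_CMDS_DEFAULT 6

/*
 * Hypothetical helper: force a requested value into the supported range.
 * Clamping (instead of failing the module load) is an assumption here.
 */
static int clamp_tape_cmds(int requested)
{
	if (requested < TAPE_CMDS_MIN)
		return TAPE_CMDS_MIN;
	if (requested > TAPE_CMDS_MAX)
		return TAPE_CMDS_MAX;
	return requested;
}
```

With these bounds, a request of 1 would become 2, a request of 64 would become 16, and the default of 6 passes through unchanged.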
Stephen M. Cameron 063d2cf72a cciss: do not use bit 2 doorbell reset
It causes NMIs which are undesirable at best, unsurvivable at worst.
Prefer the soft reset instead.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:58 -06:00
Stephen M. Cameron ec52d5f1cb cciss: do not attempt PCI power management reset method if we know it won't work.
Just go straight to the soft-reset method instead.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:57 -06:00
Stephen M. Cameron 93c46c2fa7 cciss: remove superfluous sleeps around reset code
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:56 -06:00
Stephen M. Cameron 5afe278114 cciss: do soft reset if hard reset is broken
On driver load, if reset_devices is set, and the hard reset
attempts fail, try to bring up the controller to the point that
a command can be sent, and send it a soft reset command, then
after the reset undo whatever driver initialization was done to get
it to the point to take a command, and re-do it after the reset.

This is to get kdump to work on all the "non-resettable" controllers
(except 64xx controllers which can't be reset due to the potentially
shared cache module.)

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:56 -06:00
Stephen M. Cameron bf2e2e6b87 cciss: use new doorbell-bit-5 reset method
The bit-2-doorbell reset method seemed to cause (survivable) NMIs
on some systems and (unsurvivable) IOCK NMIs on some G7 servers.
Firmware guys implemented a new doorbell method to alleviate these
problems triggered by bit 5 of the doorbell register.  We want to
use it if it's available.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:55 -06:00
Stephen M. Cameron 3e28601fdf cciss: increase timeouts for post-reset no-ops
Just to reduce the messages about timeouts that appear.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:54 -06:00
Stephen M. Cameron 59ec86bb98 cciss: clarify messages around reset behavior
When waiting for the board to become "not ready"
don't print a message saying "waiting for board to
become ready" (possibly followed by a message saying
"failed waiting for board to become not ready").  Instead,
it should be "waiting for board to reset" and "failed
waiting for board to reset."

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:53 -06:00
Stephen M. Cameron 19adbb9254 cciss: increase time to wait for board reset to start
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:51 -06:00
Stephen M. Cameron 8f71bb829a cciss: get rid of message related magic numbers
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:50 -06:00
Stephen M. Cameron e363e01436 cciss: fix reply pool and block fetch table memory leaks
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:50 -06:00
Stephen M. Cameron 2b48085f97 cciss: factor out irq request code
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:49 -06:00
Stephen M. Cameron abf7966e61 cciss: factor out scatterlist allocation functions
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:48 -06:00
Stephen M. Cameron 54dae34320 cciss: factor out command pool allocation functions
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:47 -06:00
Stephen M. Cameron 62710ae1ce cciss: do a better job of detecting controller reset failure
Detect failure of controller reset by noticing if the 32 bytes of
"driver version" we store on the hardware in the config table
fail to get zeroed out.  Previously we noticed if the controller
did not transition to "simple mode", but this did not detect reset
failure if the controller was already in simple mode prior to
the reset attempt (e.g. due to module parameter hpsa_simple_mode=1).

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:46 -06:00
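The detection idea in the commit above can be sketched in plain C: stamp a non-zero marker into the 32-byte "driver version" area before the reset, then treat the reset as successful only if every byte reads back as zero. The struct fake_cfg_table and both helper names are hypothetical stand-ins, since the real config table layout is not shown in this log.

```c
#include <stddef.h>
#include <string.h>

#define DRIVER_VER_LEN 32	/* 32 bytes, per the commit message */

/* Hypothetical stand-in for the controller's config table. */
struct fake_cfg_table {
	unsigned char driver_version[DRIVER_VER_LEN];
};

/* Write a recognizable, non-zero marker before attempting the reset. */
static void stamp_driver_version(struct fake_cfg_table *cfg)
{
	static const char marker[] = "sketch-driver";

	memset(cfg->driver_version, 0, DRIVER_VER_LEN);
	memcpy(cfg->driver_version, marker, sizeof(marker) - 1);
}

/* Return 1 if the area was zeroed (reset took effect), 0 otherwise. */
static int reset_succeeded(const struct fake_cfg_table *cfg)
{
	size_t i;

	for (i = 0; i < DRIVER_VER_LEN; i++)
		if (cfg->driver_version[i] != 0)
			return 0;
	return 1;
}
```

Unlike the earlier "did it enter simple mode" check, this test does not depend on which mode the controller was in before the reset.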
Stephen M. Cameron 9bd3c20487 cciss: add readl after writel in interrupt mask setting code
This is to ensure the board interrupts are really off when
these functions return.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06 08:23:45 -06:00
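The write-then-read-back pattern from the commit above, sketched in userspace C. On real hardware a write to a device register may be posted (buffered on the way to the device), and a subsequent read from the same device cannot complete until that write has arrived. The accessors reg_write32 and reg_read32 below are stand-ins for the kernel's writel() and readl(), and the register and bit values are made up for illustration.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's writel()/readl() accessors. */
static void reg_write32(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
}

static uint32_t reg_read32(volatile uint32_t *reg)
{
	return *reg;
}

/*
 * Set the interrupt mask, then read the register back. On a real
 * device the read-back is what guarantees the write has landed before
 * this function returns; here it simply returns the stored value.
 */
static uint32_t mask_interrupts(volatile uint32_t *intr_mask_reg,
				uint32_t mask_bits)
{
	reg_write32(intr_mask_reg, mask_bits);
	return reg_read32(intr_mask_reg);
}
```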
Konrad Rzeszutek Wilk 3d68b39926 xen/blkback: Fix up some of the comments.
They had the wrong data or were in the wrong spot.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-05 13:43:26 -04:00
Konrad Rzeszutek Wilk fc53bf757e xen/blkback: Squash the checking for operation into dispatch_rw_block_io
We do a check for the operations right before calling dispatch_rw_block_io.
And then we do the same check in dispatch_rw_block_io. This patch
squashes those checks into the 'dispatch_rw_block_io' function.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-05 13:43:25 -04:00
Konrad Rzeszutek Wilk 24f567f952 xen/blkback: Add support for BLKIF_OP_FLUSH_DISKCACHE and drop BLKIF_OP_WRITE_BARRIER.
We drop the support for 'feature-barrier' and add in the support
for the 'feature-flush-cache' if the real backend storage supports
flushing.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-05 13:43:24 -04:00
Sage Weil 4ad12621e4 libceph: fix ceph_osdc_alloc_request error checks
ceph_osdc_alloc_request returns NULL on failure.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-03 09:28:13 -07:00
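Why the error check in the commit above matters can be shown with a userspace replica of the kernel's IS_ERR() test: that test is true only for pointers in the top 4095 addresses, so it never catches a NULL return. The allocator fake_alloc_request and the caller submit are hypothetical; only the "returns NULL on failure" behaviour comes from the commit message.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/*
 * Userspace replica of the kernel's IS_ERR(): true only for pointers
 * that encode a negative errno value in the top MAX_ERRNO addresses.
 */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_request {
	int id;
};

/* Hypothetical allocator that returns NULL on failure. */
static struct fake_request *fake_alloc_request(int fail)
{
	if (fail)
		return NULL;
	return calloc(1, sizeof(struct fake_request));
}

/* Correct caller: a NULL-returning allocator needs a NULL check. */
static int submit(int fail)
{
	struct fake_request *req = fake_alloc_request(fail);

	if (!req)
		return -ENOMEM;
	free(req);
	return 0;
}
```

Because is_err(NULL) is false, a caller that checked the result with an IS_ERR()-style test would go on to dereference a NULL pointer on allocation failure.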
Konrad Rzeszutek Wilk a19be5f0f0 Revert "xen/blkback: Move the plugging/unplugging to a higher level."
This reverts commit 97961ef46b because
we lose about 15% performance if we do the unplugging at the
end of reading the ring buffer.
2011-04-27 12:40:11 -04:00
Konrad Rzeszutek Wilk 013c3ca184 xen/blkback: Stick REQ_SYNC on WRITEs to deal with CFQ I/O scheduler.
If one runs a simple fio request with random read/write with a
20%/80% ratio, the numbers are incredibly bad when using the CFQ scheduler.

IOmeter       |       |      |          |
64K, randrw   |  NOOP | CFQ  | deadline |
randrwmix=80  |       |      |          |
--------------+-------+------+----------+
blkback       |103/27 |32/10 | 102/27   |
--------------+-------+------+----------+
QEMU qdisk    |103/27 |102/27| 102/27   |

The problem as explained by Vivek Goyal was:

".. that difference is that sync vs async requests. In the case of
a kernel thread submitting IO, [..] all the WRITES might be being
considered as async and will go in a different queue. If you mix those
with some READS, they are always sync and will go in different queue.
In presence of sync queue, CFQ will idle and choke up WRITES in
an attempt to improve latencies of READs.

In case of AIO [note: this is what QEMU qdisk is doing] , [..]
it is direct IO and both READS and WRITES will be considered SYNC
and will go in a single queue and no choking of WRITES will take place."

The solution is quite simple, tack on REQ_SYNC (which is
what the WRITE_ODIRECT macro points to) and the numbers go
back up.

Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-26 16:24:18 -04:00
Konrad Rzeszutek Wilk 97961ef46b xen/blkback: Move the plugging/unplugging to a higher level.
We used to do the plug/unplug on the submit_bio. But that means
if within a stream of WRITE, WRITE, WRITE,...,WRITE we have
one READ, it could stall the pipeline (as the 'submit_bio'
could trigger the unplug_fnc to be called and stall/sync
when doing the READ). Instead we want to move the unplugging
to when the whole (or as much as possible) ring buffer has been
processed. This also eliminates us doing plug/unplug for
each request.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-26 13:01:32 -04:00
Tejun Heo 9fd097b149 block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
In-kernel disk event polling doesn't matter for legacy/fringe drivers
and may lead to infinite event loop if ->check_events() implementation
generates events on level condition instead of edge.

Now that block layer supports suppressing exporting unlisted events,
simply leaving disk->events cleared allows these drivers to keep the
internal revalidation behavior intact while avoiding weird
interactions with userland event handler.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-21 21:33:05 +02:00
Tejun Heo d4dc210f69 block: don't block events on excl write for non-optical devices
Disk event code automatically blocks events on excl write.  This is
primarily to avoid issuing polling commands while burning is in
progress.  This behavior doesn't fit other types of devices with
removable media where polling commands don't have adverse side
effects and door locking usually doesn't exist.

This patch introduces new genhd flag which controls the auto-blocking
behavior and uses it to enable auto-blocking only on optical devices.

Note for stable: 2.6.38 and later only

Cc: stable@kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-21 20:54:46 +02:00
Konrad Rzeszutek Wilk 8b6bf747d7 xen/blkback: Prefix exposed functions with xen_
And also shorten the name if it has blkback to blkbk.

This makes the symbol table (if compiled in the kernel)
much shorter, prettier, and also easier to search.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-20 11:58:03 -04:00
Konrad Rzeszutek Wilk 42c7841d17 xen-blkback: Inline some of the functions that were moved from vbd/interface.c
Shuffling code around.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-20 11:58:02 -04:00
Konrad Rzeszutek Wilk 6cd0388cd6 xen-blkback: Remove from the copyright notice the address.
There is no need for it, as the address is updated constantly
in the root of the Linux kernel.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-20 11:58:01 -04:00
Konrad Rzeszutek Wilk ee9ff8537e xen/blkback: Squash vbd.c,interface.c in blkback.c and xenbus.c respectively.
Daniel Stodden suggested to eliminate vbd.c and interface.c, inlining the
critical bits where they belong, respectively.

Leaving only blkback.c for the data- and xenbus.c for the control path.

Suggested-by:  Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-20 11:57:59 -04:00
Konrad Rzeszutek Wilk dfc07b13dc xen/blkback: Move it from drivers/xen to drivers/block
.. and modify the Makefile and Kconfig files appropriately.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-18 14:30:26 -04:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Torvalds 7e599e6e62 drbd: fix up merge error
In commit 95a0f10cdd ("drbd: store in-core bitmap little endian,
regardless of architecture") drbd had made the sane choice to use
little-endian bitmap functions everywhere.  However, it used the
horrible old function names from <asm-generic/bitops/le.h>, that were
never really meant to be exported.

In the meantime, things got cleaned up, and in commit c4945b9ed4
("asm-generic: rename generic little-endian bitops functions") we
renamed the LE bitops to something sane, exactly so that they could be
used in random code without people gouging their eyes out when seeing
the crazy jumble of letters that were the old internal names.

As a result the drbd thing merged cleanly (commit 8d49a77568d1: "Merge
branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block"),
since there was no data conflict - but the end result obviously doesn't
actually compile.

Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-28 07:42:58 -07:00
Linus Torvalds 8d49a77568 Merge branch 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.39/drivers' of git://git.kernel.dk/linux-2.6-block: (122 commits)
  cciss: fix lost command issue
  drbd: need include for bitops functions declarations
  Revert "cciss: Add missing allocation in scsi_cmd_stack_setup and  corresponding deallocation"
  cciss: fix missed command status value CMD_UNABORTABLE
  cciss: remove unnecessary casts
  cciss: Mask off error bits of c->busaddr in cmd_special_free when calling pci_free_consistent
  cciss: Inform controller we are using 32-bit tags.
  cciss: hoist tag masking out of loop
  cciss: Add missing allocation in scsi_cmd_stack_setup and  corresponding deallocation
  cciss: export resettable host attribute
  drbd: drop code present under #ifdef which is relevant to 2.6.28 and below
  drbd: Fixed handling of read errors on a 'VerifyS' node
  drbd: Fixed handling of read errors on a 'VerifyT' node
  drbd: Implemented real timeout checking for request processing time
  drbd: Remove unused function atodb_endio()
  drbd: improve log message if received sector offset exceeds local capacity
  drbd: kill dead code
  drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails
  drbd: Removed left over, now wrong comments
  drbd: serialize admin requests for new verify run with pending bitmap io
  ...
2011-03-27 20:02:07 -07:00
Linus Torvalds 6c51038900 Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
  Documentation/iostats.txt: bit-size reference etc.
  cfq-iosched: removing unnecessary think time checking
  cfq-iosched: Don't clear queue stats when preempt.
  blk-throttle: Reset group slice when limits are changed
  blk-cgroup: Only give unaccounted_time under debug
  cfq-iosched: Don't set active queue in preempt
  block: fix non-atomic access to genhd inflight structures
  block: attempt to merge with existing requests on plug flush
  block: NULL dereference on error path in __blkdev_get()
  cfq-iosched: Don't update group weights when on service tree
  fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
  block: Require subsystems to explicitly allocate bio_set integrity mempool
  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  fs: make fsync_buffers_list() plug
  mm: make generic_writepages() use plugging
  blk-cgroup: Add unaccounted time to timeslice_used.
  block: fixup plugging stubs for !CONFIG_BLOCK
  block: remove obsolete comments for blkdev_issue_zeroout.
  blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
  ...

Fix up conflicts in fs/{aio.c,super.c}
2011-03-24 10:16:26 -07:00
Bud Brown 1ddd504954 cciss: fix lost command issue
Under certain workloads a command may seem to get lost. IOW, the Smart Array
thinks all commands have been completed but we still have commands in our
completion queue. This may lead to system instability, filesystems going
read-only, or even panics depending on the affected filesystem. We add an
extra read to force the write to complete.

Testing shows this extra read avoids the problem.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-23 20:47:11 +01:00
Linus Torvalds 0adfc56ce8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: use watch/notify for changes in rbd header
  libceph: add lingering request and watch/notify event framework
  rbd: update email address in Documentation
  ceph: rename dentry_release -> d_release, fix comment
  ceph: add request to the tail of unsafe write list
  ceph: remove request from unsafe list if it is canceled/timed out
  ceph: move readahead default to fs/ceph from libceph
  ceph: add ino32 mount option
  ceph: update common header files
  ceph: remove debugfs debug cruft
  libceph: fix osd request queuing on osdmap updates
  ceph: preserve I_COMPLETE across rename
  libceph: Fix base64-decoding when input ends in newline.
2011-03-22 16:25:25 -07:00
Yehuda Sadeh 59c2be1e4d rbd: use watch/notify for changes in rbd header
Send notifications when we change the rbd header (e.g. create a snapshot)
and wait for such notifications.  This allows synchronizing the snapshot
creation between different rbd clients/tools.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-22 11:33:56 -07:00
Linus Torvalds e16b396ce3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
  doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
  Update cpuset info & webiste for cgroups
  dcdbas: force SMI to happen when expected
  arch/arm/Kconfig: remove one to many l's in the word.
  asm-generic/user.h: Fix spelling in comment
  drm: fix printk typo 'sracth'
  Remove one to many n's in a word
  Documentation/filesystems/romfs.txt: fixing link to genromfs
  drivers:scsi Change printk typo initate -> initiate
  serial, pch uart: Remove duplicate inclusion of linux/pci.h header
  fs/eventpoll.c: fix spelling
  mm: Fix out-of-date comments which refers non-existent functions
  drm: Fix printk typo 'failled'
  coh901318.c: Change initate to initiate.
  mbox-db5500.c Change initate to initiate.
  edac: correct i82975x error-info reported
  edac: correct i82975x mci initialisation
  edac: correct commented info
  fs: update comments to point correct document
  target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
  ...

Trivial conflict in fs/eventpoll.c (spelling vs addition)
2011-03-18 10:37:40 -07:00
Stephen Rothwell f0ff1357ce drbd: need include for bitops functions declarations
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-17 15:02:51 +01:00
Linus Torvalds dc113c1f1d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/block: amiflop - Remove superfluous amiga_chip_alloc() cast
  m68k/atari: ARAnyM - Add support for network access
  m68k/atari: ARAnyM - Add support for console access
  m68k/atari: ARAnyM - Add support for block access
  m68k/atari: Initial ARAnyM support
  m68k: Kconfig - Remove unneeded "default n"
  m68k: Makefiles - Change to new flags variables
  m68k/amiga: Reclaim Chip RAM for PPC exception handlers
  m68k: Allow all kernel traps to be handled via exception fixups
  m68k: Use base_trap_init() to initialize vectors
  m68k: Add helper function handle_kernel_fault()
2011-03-16 19:08:03 -07:00
Linus Torvalds 4c5811bf46 Merge branch 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6: (21 commits)
  tty: serial: altera_jtaguart: Add device tree support
  tty: serial: altera_uart: Add devicetree support
  dt: eliminate of_platform_driver shim code
  dt: Eliminate of_platform_{,un}register_driver
  dt/serial: Eliminate users of of_platform_{,un}register_driver
  dt/usb: Eliminate users of of_platform_{,un}register_driver
  dt/video: Eliminate users of of_platform_{,un}register_driver
  dt/net: Eliminate users of of_platform_{,un}register_driver
  dt/sound: Eliminate users of of_platform_{,un}register_driver
  dt/spi: Eliminate users of of_platform_{,un}register_driver
  dt: uartlite: merge platform and of_platform driver bindings
  dt: xilinx_hwicap: merge platform and of_platform driver bindings
  ipmi: convert OF driver to platform driver
  leds/leds-gpio: merge platform_driver with of_platform_driver
  dt/sparc: Eliminate users of of_platform_{,un}register_driver
  dt/powerpc: Eliminate users of of_platform_{,un}register_driver
  dt/powerpc: move of_bus_type infrastructure to ibmebus
  drivercore/dt: add a match table pointer to struct device
  dt: Typo fix.
  altera_ps2: Add devicetree support
  ...
2011-03-16 17:28:10 -07:00
Linus Torvalds 7a6362800c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
  bonding: enable netpoll without checking link status
  xfrm: Refcount destination entry on xfrm_lookup
  net: introduce rx_handler results and logic around that
  bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
  bonding: wrap slave state work
  net: get rid of multiple bond-related netdevice->priv_flags
  bonding: register slave pointer for rx_handler
  be2net: Bump up the version number
  be2net: Copyright notice change. Update to Emulex instead of ServerEngines
  e1000e: fix kconfig for crc32 dependency
  netfilter ebtables: fix xt_AUDIT to work with ebtables
  xen network backend driver
  bonding: Improve syslog message at device creation time
  bonding: Call netif_carrier_off after register_netdevice
  bonding: Incorrect TX queue offset
  net_sched: fix ip_tos2prio
  xfrm: fix __xfrm_route_forward()
  be2net: Fix UDP packet detected status in RX compl
  Phonet: fix aligned-mode pipe socket buffer header reserve
  netxen: support for GbE port settings
  ...

Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
2011-03-16 16:29:25 -07:00
Geert Uytterhoeven 059718d572 m68k/block: amiflop - Remove superfluous amiga_chip_alloc() cast
amiga_chip_alloc() returns a void *, so we don't need a cast.
Also clean up coding style while we're at it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2011-03-16 19:11:25 +01:00
Linus Torvalds 76ca078328 Merge branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
  xen: suspend: remove xen_hvm_suspend
  xen: suspend: pull pre/post suspend hooks out into suspend_info
  xen: suspend: move arch specific pre/post suspend hooks into generic hooks
  xen: suspend: refactor non-arch specific pre/post suspend hooks
  xen: suspend: add "arch" to pre/post suspend hooks
  xen: suspend: pass extra hypercall argument via suspend_info struct
  xen: suspend: refactor cancellation flag into a structure
  xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding
  xen: switch to new schedop hypercall by default.
  xen: use new schedop interface for suspend
  xen: do not respond to unknown xenstore control requests
  xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled
  xen: PV on HVM: support PV spinlocks and IPIs
  xen: make the ballon driver work for hvm domains
  xen-blkfront: handle Xen major numbers other than XENVBD
  xen: do not use xen_info on HVM, set pv_info name to "Xen HVM"
  xen: no need to delay xen_setup_shutdown_event for hvm guests anymore
2011-03-15 10:59:09 -07:00
Linus Torvalds 27d2a8b97e Merge branches 'stable/ia64', 'stable/blkfront-cleanup' and 'stable/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/ia64' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: ia64 build broken due to "xen: switch to new schedop hypercall by default."

* 'stable/blkfront-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: Union the blkif_request request specific fields

* 'stable/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: annotate functions which only call into __init at start of day
  xen p2m: annotate variable which appears unused
  xen: events: mark cpu_evtchn_mask_p as __refdata
2011-03-15 10:49:16 -07:00
Jens Axboe b66538014f Revert "cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation"
This reverts commit 978eb516a4.

The commit was broken, relying on other changes that have not been
committed yet.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-12 13:47:51 +01:00
Stephen M. Cameron 6d9a4f9e21 cciss: fix missed command status value CMD_UNABORTABLE
and fix a nearby typo, "do" that should have been "due"

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-12 10:02:30 +01:00
Stephen M. Cameron fcab1c112a cciss: remove unnecessary casts
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-12 10:02:24 +01:00
Stephen M. Cameron 16011131ce cciss: Mask off error bits of c->busaddr in cmd_special_free when calling pci_free_consistent
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-12 10:02:21 +01:00
Stephen M. Cameron 0498cc2a9e cciss: Inform controller we are using 32-bit tags.
Controller will DMA only 32-bits of the tag per command
on completion if it knows we are only using 32-bit tags.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-12 10:02:16 +01:00
Stephen M. Cameron 4a76504655 cciss: hoist tag masking out of loop
In process_nonindexed_cmd, hoist figuring of masked tag out of loop since
it is the same throughout.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-12 10:02:11 +01:00
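The loop-invariant hoisting described in the commit above, as a small C sketch. The masked tag does not change between iterations, so it is computed once before the search loop. The TAG_FLAG_MASK value, struct fake_cmd and find_cmd are hypothetical; the real process_nonindexed_cmd code is not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the low bits of the raw tag carry status flags. */
#define TAG_FLAG_MASK 0x3u

struct fake_cmd {
	uint32_t busaddr;
};

/* Find the command whose bus address matches the masked tag. */
static struct fake_cmd *find_cmd(struct fake_cmd *cmds, size_t n,
				 uint32_t raw_tag)
{
	/* Same for every iteration, so compute it once, outside the loop. */
	const uint32_t tag = raw_tag & ~TAG_FLAG_MASK;
	size_t i;

	for (i = 0; i < n; i++)
		if (cmds[i].busaddr == tag)
			return &cmds[i];
	return NULL;
}
```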
Stephen M. Cameron 978eb516a4 cciss: Add missing allocation in scsi_cmd_stack_setup and corresponding deallocation
This bit got lost somewhere along the way.  Without this, panic.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-11 20:07:38 +01:00
Stephen M. Cameron 957c2ec558 cciss: export resettable host attribute
This attribute, requested by Red Hat, allows kexec-tools to know
whether the controller can honor the reset_devices kernel parameter
and actually reset the controller.  For kdump to work properly it
is necessary that the reset_devices parameter be honored.  This
attribute enables kexec-tools to warn the user if they attempt to
designate a non-resettable controller as the dump device.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-11 20:06:09 +01:00
Or Gerlitz 03567812d8 drbd: drop code present under #ifdef which is relevant to 2.6.28 and below
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:21 +01:00
Philipp Reisner 7961243b7b drbd: Fixed handling of read errors on a 'VerifyS' node
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:20 +01:00
Philipp Reisner 8f21420ebd drbd: Fixed handling of read errors on a 'VerifyT' node
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:18 +01:00
Philipp Reisner 7fde2be930 drbd: Implemented real timeout checking for request processing time
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:16 +01:00
Andreas Gruenbacher c5a9161979 drbd: Remove unused function atodb_endio()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:15 +01:00
Lars Ellenberg fdda6544ad drbd: improve log message if received sector offset exceeds local capacity
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:13 +01:00
Lars Ellenberg e99dc367b3 drbd: kill dead code
This code became obsolete and unused last December with
 drbd: bitmap keep track of changes vs on-disk bitmap

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:12 +01:00
Lars Ellenberg 10f6d9926c drbd: don't BUG_ON, if bio_add_page of a single page to an empty bio fails
Just deal with it more gracefully, if we fail to add even a single page
to an empty bio. We used to BUG_ON() there, but it has been observed in
some Xen deployment, so we need to handle that case more robustly now.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:10 +01:00
Philipp Reisner 039312b648 drbd: Removed left over, now wrong comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:09 +01:00
Lars Ellenberg 873b0d5f98 drbd: serialize admin requests for new verify run with pending bitmap io
This is an addendum to
 drbd: serialize admin requests for new resync with pending bitmap io

It avoids a race that could trigger "FIXME" assert log messages.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:07 +01:00
Lars Ellenberg e636db5b95 drbd: fix potential imbalance of ap_in_flight
When we receive a barrier ack, we walk the ring list of drbd requests
in the transfer log of the respective epoch, do some housekeeping,
and free those objects.

We tried to keep epochs of mirrored and unmirrored drbd requests
separate, and assert that no local-only requests are present in a
barrier_acked epoch.

It turns out that this has quite a number of corner cases and would
add bloated code without functional benefit.

We now revert the (insufficient) commits
 drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions
 drbd: Ensure that an epoch contains only requests of one kind
and instead fix the processing of barrier acks to cope with
a mix of local-only and mirrored requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:06 +01:00
Lars Ellenberg 0ddc5549f8 drbd: silence some noisy log messages during disconnect
If we fail to send the information that we lost our disk,
we have no connection, and no disk: no access to data anymore.
That is either expected (deconfiguration), or there will be so much
noise in the logs that "Sending state failed" is not useful at all.
Drop it.

If the reason for a shorter than expected receive was a signal,
which we sent because we already decided to disconnect,
these additional log messages are confusing and useless.

This patch follows this pattern:
 - dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);
 + if (!signal_pending(current))
 + 	dev_warn(DEV, "short read expecting header on sock: r=%d\n", r);

Also make them all dev_warn for consistency.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:04 +01:00
Lars Ellenberg 20ceb2b22e drbd: describe bitmap locking for bulk operation in finer detail
Now that we no longer endian-swap the bitmap in place, we allow
selected bitmap operations (testing bits, sometimes even setting bits)
during some bulk operations.

This caused us to hit a lot of FIXME asserts similar to
	FIXME asender in drbd_bm_count_bits,
	bitmap locked for 'write from resync_finished' by worker
Which now is nonsense: looking at the bitmap is perfectly legal
as long as it is not being resized.

This cosmetic patch defines some flags to describe expectations in finer
detail, so the asserts in e.g. bm_change_bits_to() can be skipped if
appropriate.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:02 +01:00
Lars Ellenberg 62b0da3a24 drbd: log UUIDs whenever they change
All decisions about sync, sync direction, and whether or not to
allow a connect or attach are based on our set of UUIDs to tag a
data generation.

Log changes to the UUIDs whenever they occur,
logging "new current UUID P:Q:R:S" is more useful
than "Creating new current UUID".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:01 +01:00
Philipp Reisner d07c9c10e5 drbd: We can not process BIOs with a size of 0
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:47:59 +01:00
Philipp Reisner cd88d030d4 drbd: Provide hints with the error message when clearing the sync pause flag
When the user clears the sync-pause flag, and sync stays in pause
state, give hints to the user, why it still is in pause state.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:47:58 +01:00
Lars Ellenberg 79a30d2d71 drbd: queue bitmap writeout more intelligently
The "lazy writeout" of cleared bitmap pages happens during resync, and
should happen again once the resync finishes cleanly, or is aborted.

If resync finished cleanly, or was aborted because of peer disk
failure, we trigger the writeout from worker context in the after
state change work.

If resync was aborted because of connection failure, we should not
immediately trigger bitmap writeout, but rather postpone the
writeout to after the connection cleanup happened.  We now do it
in the receiver context from drbd_disconnect().

If resync was aborted because of local disk failure, well, there
is nothing to write to anymore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:47:56 +01:00
Lars Ellenberg 54b956abef drbd: don't pointlessly queue bitmap send, if we lost connection
This is a minor optimization and cleanup,
and also considerably reduces some harmless (but noisy) race with
the connection cleanup code.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:47:55 +01:00
Lars Ellenberg 194bfb32db drbd: serialize admin requests for new resync with pending bitmap io
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:47:53 +01:00
Lars Ellenberg 6c922ed543 drbd: only generate and send a new sync uuid after a successful state change
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:47:52 +01:00
Philipp Reisner 20ee639024 drbd: cleaned up __set_current_state() followed by schedule_timeout() calls
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:47:42 +01:00
Philipp Reisner 6a35c45f89 drbd: Ensure that an epoch contains only requests of one kind
The assert in drbd_req.c:755 forces us to have only requests of
one kind in an epoch. The two kinds we distinguish here are:
local-only or mirrored.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:42 +01:00
Philipp Reisner 2deb8336d0 drbd: Fixed P_NEG_ACK processing for protocol A and B
Protocol A has no P_WRITE_ACKs, but has P_NEG_ACKs.
The master bio might already be completed, therefore the
request is no longer in the collision hash.
=> Do not try to validate block_id as request

In Protocol B we might already have got a P_RECV_ACK
but then get a P_NEG_ACK afterwards.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:40 +01:00
Philipp Reisner 94f2b05f03 drbd: Killed an assert that is no longer valid
The point is that drbd_disconnect() can be called with a cstate of
WFConnection.

That happens if the user issues "drbdsetup disconnect" while the
drbd_connect() function executes. Then drbdd_init() will call
drbdd(), which in turn will return without receiving any
packets. Then drbdd_init() will end up calling drbd_disconnect()
with a cstate of WFConnection.

Bottom line: This assertion is wrong as it is, and we do not
see value in fixing it. => Removing it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:39 +01:00
Philipp Reisner 148efa165e drbd: Do not drop net config if sending in drbd_send_protocol() fails
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:37 +01:00
Philipp Reisner 370a43e798 drbd: Work on the Ahead -> SyncSource transition
The test for rs_pending_cnt == 0 was too weak. Test for
unacked_cnt == 0 instead, and move that test into the worker,
since unacked_cnt is already increased when a P_RS_DATA_REQ
comes in.

Also using a timer to make Ahead -> SyncSource -> Ahead cycles
slower...

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:36 +01:00
Philipp Reisner 71c78cfba2 drbd: Nothing should stop SyncSource -> Ahead transitions
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:34 +01:00
Philipp Reisner 4a23f26496 drbd: Do not full sync if a P_SYNC_UUID packet gets lost
See also commit from 2009-08-15
"drbd_uuid_compare(): Do not full sync in case a P_SYNC_UUID packet gets lost."

We saw cases where the History UUIDs were not as expected. So the
detection of the special case did not trigger. With the sync UUID
no longer being a random number, but deducible from the previous
bitmap UUID, the detection of this special case becomes more
reliable.

The SyncUUID now is the previous bitmap UUID + 0x1000000000000.

Rule 5a:
Cs = H1p & H1p + Offset = Bp
  Connection was lost before SyncUUID Packet came through.
  Correct (peer) UUIDs:
   Bp = H1p
   H1p = H2p
   H2p = 0
  Become Sync target.

Rule 7a:
Cp = H1s & H1s + Offset = Bs
  Connection was lost before SyncUUID Packet came through.
  Correct (own) UUIDs:
   Bs = H1s
   H1s = H2s
   H2s = 0
  Become Sync source.
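
A minimal sketch of Rule 5a above, for illustration only. The struct and
function names are hypothetical and not taken from the driver; the offset
is the value quoted in this message. Any flag bits a real implementation
might keep inside a UUID are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset quoted above: SyncUUID = previous bitmap UUID + offset. */
#define SYNC_UUID_OFFSET UINT64_C(0x1000000000000)

/* Hypothetical container for one node's UUID set. */
struct uuid_set {
	uint64_t current;   /* C  */
	uint64_t bitmap;    /* B  */
	uint64_t history1;  /* H1 */
	uint64_t history2;  /* H2 */
};

/* Rule 5a condition: Cs == H1p and H1p + Offset == Bp. */
static inline bool rule_5a_matches(const struct uuid_set *self,
				   const struct uuid_set *peer)
{
	return self->current == peer->history1 &&
	       peer->history1 + SYNC_UUID_OFFSET == peer->bitmap;
}

/* Correction listed under Rule 5a: shift the peer's UUIDs back one step. */
static inline void rule_5a_correct_peer(struct uuid_set *peer)
{
	peer->bitmap = peer->history1;
	peer->history1 = peer->history2;
	peer->history2 = 0;
}
```

Rule 7a is the mirror image: swap the roles of self and peer in the
condition and apply the same shift to the local UUID set.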

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:32 +01:00
Philipp Reisner 2b8a90b555 drbd: Corrected off-by-one error in DRBD_MINOR_COUNT_MAX
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:31 +01:00
Andreas Gruenbacher 110a204a35 drbd: Remove useless / wrong comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:29 +01:00
Philipp Reisner 794abb753e drbd: Cleaned up the resync timer logic
Besides removing a few lines of code, this moves the inspection
of the state from before the queuing process to after the queuing,
i.e. closer to the actual invocation of the work.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:28 +01:00
Philipp Reisner da0a78161d drbd: Be more careful with SyncSource -> Ahead transitions
We may not get from SyncSource to Ahead if we have sent some
P_RS_DATA_REPLY packets to the peer and are waiting for
P_WRITE_ACK.

Again, this is not relevant for properly tuned systems, but makes
sure that the not-tuned system does not get diverging bitmaps.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:26 +01:00
Philipp Reisner d612d309e4 drbd: No longer answer P_RS_DATA_REQUEST packets when in C_AHEAD mode
If the sync source node replies to a P_RS_DATA_REQUEST packet
while it is already in ahead mode, i.e. the two packets
crossed each other on the wire, that may lead to diverging
bitmaps.

  This never happens in a well-tuned-system. In a well-tuned-
  system the resync controller has reduced the resync speed
  to zero long before we got into ahead-mode.

But we have to be prepared for the not-well-tuned-system
of course as well.
Because -> diverging bitmaps = non terminating resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:25 +01:00
Philipp Reisner 617049aa7d drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions
Create a new barrier when leaving the AHEAD mode.

  Otherwise we trigger the assertion in req_mod(, barrier_acked)
  D_ASSERT(req->rq_state & RQ_NET_SENT);

The new barrier is created by recycling the newest existing one.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:23 +01:00
Lars Ellenberg 0719427278 drbd: ratelimit io error messages
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:21 +01:00
Philipp Reisner 3f98688afc drbd: There might be a resync after unfreezing IO due to no disk [Bugz 332]
When on-no-data-accessible is set to suspend-io, also consider that
a Primary, SyncTarget node loses its connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:20 +01:00
Lars Ellenberg 725a97e43e drbd: fix potential access of on-stack wait_queue_head_t after return
I ran into something declaring itself as "spinlock deadlock",
 BUG: spinlock lockup on CPU#1, kjournald/27816, ffff88000ad6bca0
 Pid: 27816, comm: kjournald Tainted: G        W 2.6.34.6 #2
 Call Trace:
  <IRQ>  [<ffffffff811ba0aa>] do_raw_spin_lock+0x11e/0x14d
  [<ffffffff81340fde>] _raw_spin_lock_irqsave+0x6a/0x81
  [<ffffffff8103b694>] ? __wake_up+0x22/0x50
  [<ffffffff8103b694>] __wake_up+0x22/0x50
  [<ffffffffa07ff661>] bm_async_io_complete+0x258/0x299 [drbd]
but the call traces do not fit at all,
all other cpus are cpu_idle.

I think it may be this race:

drbd_bm_write_page
 wait_queue_head_t io_wait;
 atomic_t in_flight;
 bm_async_io
  submit_bio
					bm_async_io_complete
					  if (atomic_dec_and_test(in_flight))
 wait_event(io_wait,
	atomic_read(in_flight) == 0)
 return
					    wake_up(io_wait)

The wake_up now accesses the wait_queue_head_t spinlock, which is no
longer valid, since the stack frame of drbd_bm_write_page has been
clobbered now.

Fix this by using struct completion, which does both the condition test
as well as the wake_up inside its spinlock, so this race cannot happen.
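
The property relied on here can be sketched in userspace C. This is not
the kernel's struct completion, only a pthread-based stand-in with
hypothetical names: the flag is set and the waiter is woken under one
lock, and the waiter checks the flag under that same lock, so it cannot
return (and give up its stack frame) while the completing side is still
inside the critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a completion object (illustrative only). */
struct completion_sketch {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static inline void completion_sketch_init(struct completion_sketch *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Completer: set the flag and wake the waiter while holding the lock. */
static inline void completion_sketch_complete(struct completion_sketch *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Waiter: the condition is only ever tested under the same lock. */
static inline void completion_sketch_wait(struct completion_sketch *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}
```

In the racy variant described above, the waiter tests an atomic counter
outside any lock, which lets it return before the wake_up has finished
touching the on-stack wait queue.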

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:08 +01:00
Lars Ellenberg 06d33e968d drbd: improve on bitmap write out timing
Even though we now track the need for bitmap writeout per bitmap page,
there is no need to trigger the writeout while a resync is going on.

Once the resync is finished (or aborted),
we trigger bitmap writeout anyways.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:40 +01:00
Lars Ellenberg 418e0a927d drbd: spelling fix in log message
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:38 +01:00
Lars Ellenberg 7648cdfe52 drbd: be less noisy with some log messages
We expect changes to a bitmap page in drbd_bm_write_page,
that's why we submit a copy page.

If a page changes during global writeout, that would be unexpected,
and reason to warn, though.

Also, often page writeout can be skipped (on activity log transactions
during normal operation, for example), no need to log that every time.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:37 +01:00
Lars Ellenberg 5a22db8968 drbd: serialize sending of resync uuid with pending w_send_oos
To improve the latency of IO requests during bitmap exchange,
we recently allowed writes while waiting for the bitmap, sending "set
out-of-sync" information packets for any newly dirtied bits.

We have to make sure that the new resync-uuid does not overtake
these "set oos" packets. Once the resync-uuid is received, the
sync target starts the resync process, and expects the bitmap to
only be cleared, not re-set.

If we use this protocol extension, we queue the generation and sending
of the resync-uuid on the worker, which naturally serializes with all
previously queued packets.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:35 +01:00
Lars Ellenberg f735e36354 drbd: add debugging assert to make sure the protocol is clean
We expect to only receive the recently introduced "set out of sync"
packets in specific states. If we receive them in different states, that
may confuse the resync process to the point where it won't terminate, or
think it made negative progress.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:34 +01:00
Philipp Reisner c88d65e223 drbd: Documenting drbd_should_do_remote() and drbd_should_send_oos()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:32 +01:00
Lars Ellenberg 2265b473ae drbd: fix potential dereference of NULL pointer
If drbd used to have crypto digest algorithms configured and is then
unconfigured (but not unloaded), it frees the algorithms, but does not
reset the config.  If it then is reconfigured to use the very same
algorithm, it "forgot" to re-allocate the algorithms, thinking that the
config has not changed in that aspect.
It will then Oops on the first attempt to actually use those algorithms.

Fix this by resetting the config to defaults after cleanup.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:30 +01:00
Lars Ellenberg 02851e9f00 drbd: move bitmap write from resync_finished to after_state_change
We must not call it directly from resync_finished,
as we may be in either receiver or worker context there.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:29 +01:00
Lars Ellenberg 84e7c0f7d1 drbd: Removed a reference to debug macros removed long time ago
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:27 +01:00
Lars Ellenberg 6850c44214 drbd: get rid of unused debug code
Long time ago, we had paranoia code in the bitmap that allocated one
extra word, assigned a magic value, and checked on every occasion that
the magic value was still unchanged.

That debug code is unused, the extra long word complicates code a bit.
Get rid of it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:26 +01:00
Lars Ellenberg 4b0715f096 drbd: allow petabyte storage on 64bit arch
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:24 +01:00
Lars Ellenberg 19f843aa08 drbd: bitmap keep track of changes vs on-disk bitmap
When we set or clear bits in a bitmap page,
also set a flag in the page->private pointer.

This allows us to skip writes of unchanged pages.
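
A rough sketch of the idea, assuming a simplified page structure. All
names below are hypothetical; the driver keeps its flag in page->private,
which is modelled here as a plain flags field.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_BYTES 4096
#define SKETCH_PAGE_BITS  (SKETCH_PAGE_BYTES * 8)
#define PAGE_CHANGED      0x1UL

/* Hypothetical stand-in for a bitmap page plus its private flags. */
struct bm_page_sketch {
	unsigned long private_flags;
	unsigned char data[SKETCH_PAGE_BYTES];
};

/* Set one bit and remember that the containing page was touched. */
static inline void sketch_set_bit(struct bm_page_sketch *pages, size_t bitnr)
{
	struct bm_page_sketch *p = &pages[bitnr / SKETCH_PAGE_BITS];
	size_t off = bitnr % SKETCH_PAGE_BITS;

	p->data[off / 8] |= (unsigned char)(1u << (off % 8));
	p->private_flags |= PAGE_CHANGED;
}

/* Write out only flagged pages; returns how many pages were "written". */
static inline size_t sketch_writeout(struct bm_page_sketch *pages, size_t npages)
{
	size_t written = 0;

	for (size_t i = 0; i < npages; i++) {
		if (!(pages[i].private_flags & PAGE_CHANGED))
			continue;	/* unchanged since last writeout: skip */
		/* a real implementation would submit the page I/O here */
		pages[i].private_flags &= ~PAGE_CHANGED;
		written++;
	}
	return written;
}
```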

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:19 +01:00
Lars Ellenberg 95a0f10cdd drbd: store in-core bitmap little endian, regardless of architecture
Our on-disk bitmap is a little endian bitstream.
Up to now, we have stored the in-core copy of that in
native endian, applying byte order conversion when necessary.

Instead, keep the bitmap pages little endian, as they are read from disk,
and use the generic_*_le_bit family of functions.
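
For illustration, a little endian bitstream can be addressed byte by
byte, which gives the same result on any host byte order. The helpers
below are a hypothetical userspace sketch of that layout, not the kernel
functions named above.

```c
#include <assert.h>
#include <stddef.h>

/* Bit nr lives in byte nr / 8, at bit position nr % 8 within that byte. */
static inline int le_test_bit_sketch(const unsigned char *bitmap, size_t nr)
{
	return (bitmap[nr / 8] >> (nr % 8)) & 1;
}

static inline void le_set_bit_sketch(unsigned char *bitmap, size_t nr)
{
	bitmap[nr / 8] |= (unsigned char)(1u << (nr % 8));
}

static inline void le_clear_bit_sketch(unsigned char *bitmap, size_t nr)
{
	bitmap[nr / 8] &= (unsigned char)~(1u << (nr % 8));
}
```

Because the in-memory bytes already match the on-disk stream, a page
handled this way can be read and written without any conversion pass.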

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:40 +01:00
Lars Ellenberg 7777a8ba1f drbd: bitmap: don't count unused bits (fix non-terminating resync)
We trusted the on-disk bitmap to have unused bits cleared.
In case that is not true for whatever reason,
and we take a code path where the unused bits don't get cleared
elsewhere (bm_clear_surplus is not called), we may miscount the bits,
and get confused during resync, waiting for bits to get cleared that we
don't even use: the resync process would not terminate.

Fix this by masking out unused bits in __bm_count_bits.
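
A simplified sketch of such masking, assuming a bitmap stored as 64-bit
words in host order (the function names are hypothetical, and the real
__bm_count_bits works on bitmap pages rather than a flat array):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64

/* Portable population count for one 64-bit word. */
static inline unsigned popcount64(uint64_t w)
{
	unsigned n = 0;

	while (w) {
		w &= w - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Count set bits in a bitmap holding nbits valid bits.
 * Bits at or beyond nbits in the last word are masked out, not trusted.
 */
static inline size_t count_bits_masked(const uint64_t *words, size_t nbits)
{
	size_t full = nbits / BITS_PER_WORD;
	size_t tail = nbits % BITS_PER_WORD;
	size_t total = 0;

	for (size_t i = 0; i < full; i++)
		total += popcount64(words[i]);
	if (tail)
		total += popcount64(words[full] & ((UINT64_C(1) << tail) - 1));
	return total;
}
```

With the mask in place, stray bits beyond the end of the device can no
longer inflate the count the resync is waiting to reach zero.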

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:38 +01:00
Andreas Gruenbacher 1b881ef775 drbd: Rename __inc_ap_bio_cond to may_inc_ap_bio
The old name is confusing: the function does not increment anything.
Also rename _inc_ap_bio_cond to inc_ap_bio_cond: there is no need for
an underscore.
Finally, make it clear that these functions return boolean values.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:37 +01:00
Andreas Gruenbacher 24dccabb39 drbd: Fix: drbd_bitmap_io does not return an enum determine_dev_size
I guess bitmap I/O errors are supposed to cause drbd_determin_dev_size
to return dev_size_error.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:35 +01:00
Andreas Gruenbacher 2c46407d24 drbd: receive_bitmap_plain: Get rid of ugly and useless enum
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:34 +01:00
Andreas Gruenbacher f70af118e3 drbd: send_bitmap_rle_or_plain: Get rid of ugly and useless enum
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:32 +01:00
Andreas Gruenbacher 78fcbdae22 drbd: receive_bitmap: Missing free_page() on error path
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:30 +01:00
Andreas Gruenbacher de1f8e4a0a drbd: receive_bitmap: Avoid casting enum drbd_state_rv to int
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:29 +01:00
Andreas Gruenbacher 4114be815f drbd: receive_bitmap: Fix the wrong return value
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:27 +01:00
Andreas Gruenbacher f2024e7ce2 drbd: drbd_nl_disk_conf: Avoid a compiler warning
Warning: comparison between ‘enum drbd_ret_code’ and ‘enum drbd_state_rv’

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:26 +01:00
Andreas Gruenbacher 81e84650c2 drbd: Use the standard bool, true, and false keywords
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:24 +01:00
Andreas Gruenbacher 6184ea2145 drbd: This code is dead now
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:22 +01:00
Andreas Gruenbacher bb4379464e drbd: Another small enum drbd_state_rv cleanup
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:21 +01:00
Andreas Gruenbacher bf885f8a67 drbd: Be more explicit about functions that return an enum drbd_state_rv
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:19 +01:00
Andreas Gruenbacher c8b325632f drbd: Rename enum drbd_state_ret_codes to enum drbd_state_rv
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:18 +01:00
Andreas Gruenbacher 116676ca62 drbd: Rename enum drbd_ret_codes to enum drbd_ret_code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:16 +01:00
Andreas Gruenbacher 0cf9d27e38 drbd: Get rid of unnecessary macros (2)
The FAULT_ACTIVE macro just wraps the drbd_insert_fault macro for no
apparent reason.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:15 +01:00
Andreas Gruenbacher 662d91a23a drbd: Get rid of unnecessary macros (1)
This macro doesn't save much code, but makes things a lot harder to read.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:13 +01:00
Andreas Gruenbacher 2f58dcfc85 drbd: Rename drbd_make_request_26 to drbd_make_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:11 +01:00
Andreas Gruenbacher 96756784a6 drbd: Remove left-over prototype
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:10 +01:00
Andreas Gruenbacher cab2f74b45 drbd: Make sure that drbd_send() has sent the right number of bytes
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-03-10 11:36:08 +01:00
Lars Ellenberg 220df4d006 drbd: fix incomplete error message
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:02 +01:00
Andreas Gruenbacher 7e458c32da drbd: Removed an unnecessary #undef
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:22 +01:00
Lars Ellenberg 8a3c104438 drbd: fix regression, we need to close drbd epochs during normal operation
commit e2041475e6ddb081734d161f6421977323f5a9b9
drbd: Starting with protocol 96 we can allow app-IO while receiving the bitmap

Contained a bad chunk that tried to optimize away drbd barriers during
bitmap exchange, but accidentally dropped them for normal mode as well.

Impact: depending on activity log size and access pattern, activity log
extents may not be recycled in time, causing IO to block indefinitely.

Fix: skip drbd barriers only if there is no connection to send them on,
or the request being completed has not been on the network at all.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:20 +01:00
Philipp Reisner 09b9e79793 drbd: Implemented the before-resync-source handler
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:18 +01:00
Philipp Reisner 2561b9c1f1 drbd: --force option for disconnect
As the network connection can be lost at any time, a --force option
for disconnect is just a matter of completeness.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:17 +01:00
Lars Ellenberg 42ff269d10 drbd: add packet_type 27 (return_code_only) to netlink api
In case we should ever add another packet type,
we must not reuse 27, as that is currently used for
"empty" return code only replies.
Document it as such.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:15 +01:00
Lars Ellenberg 3e3a7766c2 drbd: use kzalloc and memset(,0,) to start with clean buffers in drbd_nl
Make sure we start with clean buffers to not accidentally send garbage
back to userspace. Note: has not been observed; but just in case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:14 +01:00
Lars Ellenberg 17a93f3007 drbd: remove /proc/drbd before unregistering from netlink
There still exists a (theoretical) race on module unload, where
/proc/drbd may still exist, but the netlink callback has been
unregistered already, allowing drbdsetup to shout without listeners,
and get no reply.

Reorder remove_proc_entry and unregister of netlink callback.
drbdsetup first checks for existence of the proc entry,
and if that is missing, won't even try to contact the module.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:12 +01:00
Lars Ellenberg 3da127fa88 drbd: increase module count on /proc/drbd access
If someone holds /proc/drbd open, previously rmmod would
"succeed" in starting the unload, but then block on remove_proc_entry,
leading to a situation where lsmod no longer shows drbd,
but /proc/drbd is still there (though no longer accessible).

I'd rather have rmmod fail up front in this case.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:11 +01:00
Philipp Reisner c507f46f26 drbd: Removed 20 seconds upper bound for side-stepping
Given low-enough network bandwidth combined with an IO
pattern that hammers onto a single RS-extent, side-stepping
might be necessary for much longer times.

Changed the code to print a single informational message after
20 seconds, but it keeps on stepping aside forever.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:09 +01:00
Philipp Reisner 1fc80cf378 drbd: Becoming sync target may not happen out of < C_WF_REPORT_PARAMS
This patch is actually a necessary addendum to the patch
"fix for spurious full sync (becoming sync target looked like invalidate)"

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:07 +01:00
Philipp Reisner 3719094ec2 drbd: Starting with protocol 96 we can allow app-IO while receiving the bitmap
* C_STARTING_SYNC_S, C_STARTING_SYNC_T In these states the bitmap gets
  written to disk. Locking out of app-IO is done by using the
  drbd_queue_bitmap_io() and drbd_bitmap_io() functions these days.
  It is no longer necessary to lock out app-IO based on the connection
  state.
  App-IO that may come in after the BITMAP_IO flag got cleared before the
  state transition to C_SYNC_(SOURCE|TARGET) does not get mirrored, sets
  a bit in the local bitmap, that is already set, therefore changes nothing.

* C_WF_BITMAP_S In this state we send updates (P_OUT_OF_SYNC packets).
  With that we make sure they have the same number of bits when going
  into the C_SYNC_(SOURCE|TARGET) connection state.

* C_UNCONNECTED: The receiver starts, no need to lock out IO.

* C_DISCONNECTING: in drbd_disconnect() we had a wait_event()
  to wait until ap_bio_cnt reaches 0. Removed that.

* C_TIMEOUT, C_BROKEN_PIPE, C_NETWORK_FAILURE
  C_PROTOCOL_ERROR, C_TEAR_DOWN: Same as C_DISCONNECTING

* C_WF_REPORT_PARAMS: IO still possible since that is still
  like C_WF_CONNECTION.

And we do not need to send barriers in C_WF_BITMAP_S connection state.

Allow concurrent accesses to the bitmap when receiving the bitmap.
Everything gets ORed anyways.

A drbd_free_tl_hash() is in after_state_chg_work(). At that point
all the work items of the last connections must have been processed.

Introduced a call to drbd_free_tl_hash() into drbd_free_mdev()
for paranoia reasons.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:06 +01:00
Philipp Reisner ab17b68f45 drbd: Improvements in sanitize_state()
The relevant change is that the state change to C_WF_BITMAP_S should
implicitly change pdsk to D_CONSISTENT. (Think of it as D_OUTDATED, only
without the guarantee that the peer has the outdated state written to its
meta data)

At that opportunity I restructured the switch statement so that it
gets evaluated every time. (Has declarative character)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:04 +01:00
Philipp Reisner 22afd7ee94 drbd: Fixed race condition in drbd_queue_bitmap_io
May only test for ap_bio_cnt == 0 under req_lock. It can increase
only under req_lock.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:03 +01:00
Philipp Reisner 8869d683b7 drbd: Fixed inc_ap_bio()
The condition must be checked after prepare_to_wait(). The old
implementation could lose wakeup events. Never observed in real
life.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:01 +01:00
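The lost-wakeup hazard fixed in 8869d683b7 has a userspace analogue. The Python sketch below is not the kernel code: `ApBioGate` is a hypothetical class, and `threading.Condition` stands in for the wait queue. The point it shows is that the waiter re-checks the condition only after it is registered as a waiter (here: while holding the condition's lock), so a wakeup sent between the check and the sleep cannot be missed.

```python
import threading


class ApBioGate:
    """Hypothetical counter of in-flight application IO with an upper bound."""

    def __init__(self, max_in_flight: int):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self._cond = threading.Condition()

    def inc(self) -> None:
        with self._cond:
            # The check happens under the lock the waker also takes,
            # and is repeated after every wakeup.
            while self.in_flight >= self.max_in_flight:
                self._cond.wait()
            self.in_flight += 1

    def dec(self) -> None:
        with self._cond:
            self.in_flight -= 1
            self._cond.notify()
```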
Philipp Reisner 127b317844 drbd: use test_and_set_bit() to decide if bm_io_work should be queued
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:59 +01:00
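A minimal sketch of the idea behind 127b317844, assuming the usual test-and-set semantics (set the flag, return its previous value, atomically). `WorkFlag` and `queue_once` are hypothetical names; a lock emulates the atomicity that test_and_set_bit() provides in the kernel. Only the caller that flips the flag from clear to set queues the work item.

```python
import threading


class WorkFlag:
    """Hypothetical one-bit flag with an atomic test-and-set."""

    def __init__(self):
        self._lock = threading.Lock()
        self._set = False

    def test_and_set(self) -> bool:
        with self._lock:
            old = self._set
            self._set = True
            return old


def queue_once(flag: WorkFlag, queue: list, work: str) -> bool:
    """Queue the work item only if nobody queued it before."""
    if flag.test_and_set():
        return False  # already pending, do not queue twice
    queue.append(work)
    return True
```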
Philipp Reisner aeda1cd6a5 drbd: Begin to account BIO processing time before inc_ap_bio()
Since inc_ap_bio() might sleep already

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:57 +01:00
Philipp Reisner f91ab6282d drbd: Implemented side-stepping in drbd_rs_begin_io()
Before:
  drbd_rs_begin_io() locked app-IO out of an RS extent, and
  waited then until all previous app-IO in that area finished.
  (But not only until the disk-IO was finished but until the
   barrier/epoch ack came in for that == round trip time latency ++)

After:
  As soon as new app-IO wants to start on that RS extent,
  drbd_rs_begin_io() steps aside (clearing the
  BME_NO_WRITES flag again). It retries after 100ms.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:56 +01:00
Philipp Reisner 9d77a5fee9 drbd: Make some functions static
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:54 +01:00
Philipp Reisner e3555d8545 drbd: Implemented priority inheritance for resync requests
We only issue resync requests if there is no significant application IO
going on. That is, application IO has higher priority than resync IO.

If application IO cannot be started because the resync process locked
a resync_lru entry, start the IO operations necessary to release the
lock ASAP.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:53 +01:00
Philipp Reisner 59817f4fab drbd: Do not cleanup resync LRU for the Ahead/Behind SyncSource/SyncTarget transitions
This one should be replaced with moving this cleanup to the
'right' position.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:51 +01:00
Philipp Reisner c4752ef128 drbd: When proxy's buffer drained off go into regular resync mode
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:49 +01:00
Philipp Reisner 73a01a18b9 drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNC
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:48 +01:00
Philipp Reisner 67531718d8 drbd: Implemented two new connection states Ahead/Behind
In this connection mode, the ahead node no longer replicates
application IO. The behind node's disk becomes outdated.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:46 +01:00
Philipp Reisner 422028b1ca drbd: New configuration parameters for dealing with network congestion
net {
    on_congestion {block|pull-ahead|disconnect};
    congestion-fill {sectors};
    congestion-extents {al-extents};
}

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:45 +01:00
Philipp Reisner 759fbdfba6 drbd: Track the numbers of sectors in flight
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:43 +01:00
Lars Ellenberg 688593c5a8 drbd: Renamed write_flags_to_bio() to wire_flags_to_bio()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:34:32 +01:00
Lars Ellenberg 4896e8c1b8 drbd: restore compatibility with 32bit kernels
With commit
drbd: further converge progress display of resync and online-verify
a u64/u64 division was accidentally introduced, causing an unresolvable
symbol __udivdi3 to be referenced. Actually, for that division, 32 bits are
still sufficient for now, so we can revert to unsigned long instead.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:13 +01:00
Lars Ellenberg 1816a2b47a drbd: properly use max_hw_sectors to limit our bio size
To ease tracking of bios in some hash tables, we want it to
not cross certain boundaries (128k, used to be 32k).
We limit the maximum bio size using queue parameters.

Historically some defines and variables we use there have been named
max_segment_size, which was misguided. Rename them to max_bio_size,
and use [blk_]queue_max_hw_sectors where appropriate.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:11 +01:00
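To make the boundary rule in 1816a2b47a concrete, here is a small Python sketch. It assumes 512-byte sectors and uses the 128k boundary named in the message; `crosses_boundary` is a hypothetical helper, not a kernel function. A bio is acceptable when its first and last sector fall into the same 128k-aligned window.

```python
SECTOR_SIZE = 512                 # assumption: 512-byte sectors
MAX_BIO_SIZE = 128 * 1024         # boundary from the commit message (was 32k)
SECTORS_PER_WINDOW = MAX_BIO_SIZE // SECTOR_SIZE


def crosses_boundary(start_sector: int, nr_sectors: int) -> bool:
    """True if the request spans more than one 128k-aligned window."""
    first_window = start_sector // SECTORS_PER_WINDOW
    last_window = (start_sector + nr_sectors - 1) // SECTORS_PER_WINDOW
    return first_window != last_window
```

For example, a full 256-sector request starting at sector 0 stays inside one window, while a 2-sector request starting at sector 255 does not.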
Lars Ellenberg 3129b1b9ae drbd: debug: limit netlink-broadcast of request on digest mismatch to 32k
We used to be limited to 32k requests,
but have increased that limit to 128k now.

This part of the code can only deal with 32k,
it would scramble arbitrary pages for larger requests.

As it is used for debugging only anyways,
it is ok to simply truncate the dumped data here.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:09 +01:00
Lars Ellenberg 470be44ab1 drbd: detect modification of in-flight buffers
With data-integrity digest enabled, double-check on the sending side
for modifications by upper layers of buffers under write back,
so we can tell it apart from corruption on the "wire".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:08 +01:00
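The double-check described in 470be44ab1 can be sketched in Python as follows. This is an illustration under assumptions: SHA-1 from `hashlib` stands in for whatever digest algorithm is configured, and `digest` and `modified_in_flight` are hypothetical helpers. The sender remembers the digest it computed when the buffer was queued and recomputes it later; a difference means the buffer changed locally, so a mismatch reported by the peer is not wire corruption.

```python
import hashlib


def digest(buf: bytearray) -> bytes:
    """Digest of the buffer contents (SHA-1 is only a stand-in)."""
    return hashlib.sha1(bytes(buf)).digest()


def modified_in_flight(buf: bytearray, digest_at_send: bytes) -> bool:
    """True if the buffer no longer matches the digest taken at send time."""
    return digest(buf) != digest_at_send
```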
Lars Ellenberg 5f9915bbb8 drbd: further converge progress display of resync and online-verify
Show progressbar and ETA always, with proc_details >= 1 also show the
current sector position for both resync and online-verify on both nodes.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:06 +01:00
Lars Ellenberg 18edc0b9d7 drbd: fix potential wrap of 32bit oos:%lu display in /proc/drbd
When converting bits (4k resolution, still) to kB, we shift left.  If it
was a large number of bits on a 32bit box (>= 4 TiB storage), we may
wrap the 32bit unsigned long base type, resulting in incorrect display.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:04 +01:00
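The wrap fixed in 18edc0b9d7 can be reproduced with plain arithmetic. The Python sketch below emulates a 32bit and a 64bit unsigned type by masking; the function names are hypothetical. With one bit per 4k block, converting bits to kB is a left shift by 2, and 4 TiB of storage corresponds to 2**30 bits, which is exactly where the 32bit result wraps.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
BITS_TO_KB_SHIFT = 2  # 4 KiB per bit -> multiply by 4 to get kB


def bits_to_kb_32(bits: int) -> int:
    """Conversion as a 32bit unsigned long would compute it."""
    return (bits << BITS_TO_KB_SHIFT) & MASK32


def bits_to_kb_64(bits: int) -> int:
    """Same conversion in a 64bit type."""
    return (bits << BITS_TO_KB_SHIFT) & MASK64


four_tib_in_bits = (4 * 2**40) // 4096
```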
Lars Ellenberg 2649f0809f drbd: use the resync controller for online-verify requests as well
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:03 +01:00
Lars Ellenberg e65f440d47 drbd: factor out drbd_rs_number_requests
Preparation patch to be able to use the auto-throttling resync controller
for online-verify requests as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:19:01 +01:00
Lars Ellenberg 9bd28d3c90 drbd: factor out drbd_rs_controller_reset
Preparation patch to be able to use the auto-throttling resync controller
for online-verify requests as well.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:59 +01:00
Lars Ellenberg 439d595379 drbd: show progress bar and ETA for online-verify
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:58 +01:00
Lars Ellenberg ea5442aff6 drbd: advance progress step marks for online-verify
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:56 +01:00
Lars Ellenberg c6ea14dfa3 drbd: factor out advancement of resync marks for progress reporting
This is in preparation to unify progress reporting of
online-verify and resync requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:54 +01:00
Lars Ellenberg de228bba67 drbd: initialize online-verify progress tracking on verify target
For partial (resumed) online verify, initialize the resync step marks
once we know what the online verify start sector is.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:53 +01:00
Lars Ellenberg 30b743a2d5 drbd: improve online-verify progress tracking
For a partial (resumed) online-verify, initialize rs_total not to total
bits, but to number of bits to check in this run, to match the meaning
rs_total has for actual resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:51 +01:00
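A hedged sketch of the accounting change in 30b743a2d5: for a resumed online-verify, the total is the number of bits from the start position to the end of the device, not the full bitmap. The helper name and the factor of 8 sectors per bit (4k per bit, 512-byte sectors) are assumptions for illustration.

```python
SECTORS_PER_BIT = 8  # assumption: 4 KiB per bitmap bit, 512-byte sectors


def verify_bits_to_check(total_bits: int, start_sector: int) -> int:
    """Bits covered by this verify run when resuming at start_sector."""
    start_bit = start_sector // SECTORS_PER_BIT
    return max(total_bits - start_bit, 0)
```

With this definition, progress percentages computed against the total have the same meaning for a partial verify as for a resync.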
Lars Ellenberg 2652561886 drbd: only reset online-verify start sector if verify completed
For network hiccups during online-verify, on the next verify
triggered, we by default want to resume where it left off.

After any replication link interruption, there will be a (possibly
empty) resync.  Do not reset online-verify start sector if some resync
completed, that would defeat the purpose.

Only reset the start sector once a verify run is completed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:18:49 +01:00
Jens Axboe 4c63f5646e Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/core
Conflicts:
	block/blk-core.c
	block/blk-flush.c
	drivers/md/raid1.c
	drivers/md/raid10.c
	drivers/md/raid5.c
	fs/nilfs2/btnode.c
	fs/nilfs2/mdt.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:58:35 +01:00
Jens Axboe 721a9602e6 block: kill off REQ_UNPLUG
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:27 +01:00
Jens Axboe 7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
Tejun Heo 3c0d206092 pktcdvd: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

pktcdvd needs to forward all event related operations to the
underlying device.  Forward ->check_events() instead of
->media_changed() and inherit disk->[async_]events.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Peter Osterlund <petero2@telia.com>
2011-03-09 19:54:28 +01:00
Tejun Heo 6fac80e3aa umem: Drop dummy ->media_changed()
umem doesn't implement media changed detection and there's no need to
implement dummy callback anymore.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:28 +01:00
Tejun Heo 3a200911ad xsysace: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

xsysace buffers media changed state and clears it on revalidation.  It
will behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:28 +01:00
Tejun Heo aaa7c01546 ub: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

ub buffers media changed state and clears it on revalidation.  It will
behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
2011-03-09 19:54:28 +01:00
Tejun Heo 4bbde77787 swim[3]: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

Both swim and swim3 buffer media changed state and clear it on
revalidation.  They will behave correctly with kernel event polling.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Laurent Vivier <laurent@lvivier.info>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-09 19:54:28 +01:00
Tejun Heo 507daea227 dac960: Convert to bdops->check_events()
Convert from ->media_changed() to ->check_events().

DAC960 media change notification seems to be one way (once set, never
cleared) and will generate spurious events when polled once the
condition triggers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:28 +01:00
Tejun Heo b1b56b93f3 paride: Convert to bdops->check_events()
Convert paride drivers from ->media_changed() to ->check_events().

pcd and pd buffer and clear events after reporting; however, pf
unconditionally reports MEDIA_CHANGE and will generate spurious events
when polled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Tim Waugh <tim@cyberelk.net>
2011-03-09 19:54:28 +01:00
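The difference between the two behaviours described in b1b56b93f3 shows up as soon as a device is polled repeatedly. The Python sketch below is a model only: the classes, `MEDIA_CHANGE` value and `poll` helper are hypothetical and merely mirror the idea of a check_events() callback. A driver that buffers the change and clears it after reporting yields one event per media change; one that reports unconditionally yields an event on every poll.

```python
MEDIA_CHANGE = 1  # hypothetical event bit


class BufferedDrive:
    """Buffers a media change and clears it once reported."""

    def __init__(self):
        self._changed = False

    def media_swapped(self) -> None:
        self._changed = True

    def check_events(self) -> int:
        events = MEDIA_CHANGE if self._changed else 0
        self._changed = False
        return events


class UnconditionalDrive:
    """Always reports a media change, like the pf case in the message."""

    def check_events(self) -> int:
        return MEDIA_CHANGE


def poll(drive, rounds: int) -> int:
    """Count how many polls report a media change."""
    return sum(1 for _ in range(rounds) if drive.check_events() & MEDIA_CHANGE)
```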
Tejun Heo 1a8a74f03f floppy,{ami|ata}flop: Convert to bdops->check_events()
Convert the floppy drivers from ->media_changed() to ->check_events().
Both floppy and ataflop buffer the media changed state bit and clear it
on revalidation, and will behave correctly with kernel event polling.

I can't tell how amiflop clears its event and it's possible that it
may generate spurious events when polled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
2011-03-09 19:54:27 +01:00
Owen Smith 51de69523f xen: Union the blkif_request request specific fields
Prepare for extending the block device ring to allow request
specific fields, by moving the request specific fields for
reads, writes and barrier requests to a union member.

Acked-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-08 15:07:00 -05:00
Tejun Heo e83a46bbb1 Merge branch 'for-linus' of ../linux-2.6-block into block-for-2.6.39/core
This merge creates two sets of conflicts.  One is simple context
conflicts caused by removal of throtl_scheduled_delayed_work() in
for-linus and removal of throtl_shutdown_timer_wq() in
for-2.6.39/core.

The other is caused by commit 255bb490c8 (block: blk-flush shouldn't
call directly into q->request_fn() __blk_run_queue()) in for-linus
clashing with the FLUSH reimplementation in for-2.6.39/core.  The conflict
isn't trivial but the resolution is straightforward.

* __blk_run_queue() calls in flush_end_io() and flush_data_end_io()
  should be called with @force_kblockd set to %true.

* elv_insert() in blk_kick_flush() should use
  %ELEVATOR_INSERT_REQUEUE.

Both changes are to avoid invoking ->request_fn() directly from
request completion path and closely match the changes in the commit
255bb490c8.

Signed-off-by: Tejun Heo <tj@kernel.org>
2011-03-04 19:09:02 +01:00
David S. Miller 0a0e9ae1bd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x.h
2011-03-03 21:27:42 -08:00
Patrick McHardy 01a16b21d6 netlink: kill eff_cap from struct netlink_skb_parms
Netlink message processing in the kernel is synchronous these days, so
capabilities can be checked directly in security_netlink_recv() from
the current process.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Reviewed-by: James Morris <jmorris@namei.org>
[chrisw: update to include pohmelfs and uvesafb]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03 13:32:07 -08:00
Petr Uzel fd51469fb6 block: kill loop_mutex
The following steps lead to a deadlock in the kernel:

dd if=/dev/zero of=img bs=512 count=1000
losetup -f img
mkfs.ext2 /dev/loop0
mount -t ext2 -o loop /dev/loop0 mnt
umount mnt/

Stacktrace:
[<c102ec04>] irq_exit+0x36/0x59
[<c101502c>] smp_apic_timer_interrupt+0x6b/0x75
[<c127f639>] apic_timer_interrupt+0x31/0x38
[<c101df88>] mutex_spin_on_owner+0x54/0x5b
[<fe2250e9>] lo_release+0x12/0x67 [loop]
[<c10c4eae>] __blkdev_put+0x7c/0x10c
[<c10a4da5>] fput+0xd5/0x1aa
[<fe2250cf>] loop_clr_fd+0x1a9/0x1b1 [loop]
[<fe225110>] lo_release+0x39/0x67 [loop]
[<c10c4eae>] __blkdev_put+0x7c/0x10c
[<c10a59d9>] deactivate_locked_super+0x17/0x36
[<c10b6f37>] sys_umount+0x27e/0x2a5
[<c10b6f69>] sys_oldumount+0xb/0xe
[<c1002897>] sysenter_do_call+0x12/0x26
[<ffffffff>] 0xffffffff

Regression since 2a48fc0ab2, which introduced the private
loop_mutex as part of the BKL removal process.

As per [1], the mutex can be safely removed.

[1] http://www.gossamer-threads.com/lists/linux/kernel/1341930

Addresses: https://bugzilla.novell.com/show_bug.cgi?id=669394
Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=29172

Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
Cc: stable@kernel.org
Reviewed-by: Nikanth Karthikesan <knikanth@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-03 11:53:25 -05:00
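The stack trace in fd51469fb6 shows lo_release() appearing twice in one call chain, which suggests the mutex is being taken again by a path that already holds it. The Python sketch below models that pattern with a non-recursive `threading.Lock`; the function names echo the trace, but the code is a hypothetical illustration, not the loop driver. A non-blocking acquire is used so the sketch reports the problem instead of hanging.

```python
import threading


def release_chain(reenter: bool) -> list:
    """Run a release path that may call back into itself."""
    loop_mutex = threading.Lock()  # non-recursive, like a kernel mutex
    log = []

    def lo_release(depth: int) -> None:
        if not loop_mutex.acquire(blocking=False):
            # A blocking acquire here would never return.
            log.append(f"lo_release depth {depth}: mutex already held")
            return
        try:
            log.append(f"lo_release depth {depth}: got mutex")
            if reenter and depth == 0:
                lo_release(depth + 1)  # nested release on the same path
        finally:
            loop_mutex.release()

    lo_release(0)
    return log
```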
Vivek Goyal cd25f54961 loop: No need to initialize ->queue_lock explicitly before calling blk_cleanup_queue()
Now we initialize ->queue_lock at queue allocation time, so the driver does
not have to worry about initializing it before calling
blk_cleanup_queue().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-02 19:06:49 -05:00
Grant Likely 1c48a5c93d dt: Eliminate of_platform_{,un}register_driver
Final step to eliminate of_platform_bus_type.  They're all just
platform drivers now.

v2: fix typo in pasemi_nand.c (thanks to Stephen Rothwell)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-28 13:22:46 -07:00
Linus Torvalds 638691a7a4 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  md: Fix - again - partition detection when array becomes active
  Fix over-zealous flush_disk when changing device size.
  md: avoid spinlock problem in blk_throtl_exit
  md: correctly handle probe of an 'mdp' device.
  md: don't set_capacity before array is active.
  md: Fix raid1->raid0 takeover
2011-02-25 11:13:26 -08:00
Stefano Stabellini c80a420995 xen-blkfront: handle Xen major numbers other than XENVBD
This patch makes sure blkfront handles correctly virtual device numbers
corresponding to Xen emulated IDE and SCSI disks: in those cases
blkfront translates the major number to XENVBD and the minor number to a
low xvd minor.

Note: this behaviour is different from what old xenlinux PV guests used
to do: they used to steal an IDE or SCSI major number and use it
instead.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
2011-02-25 16:43:05 +00:00
NeilBrown 93b270f76e Fix over-zealous flush_disk when changing device size.
There are two cases when we call flush_disk.
In one, the device has disappeared (check_disk_change) so any
data we hold becomes irrelevant.
In the other, the device has changed size (check_disk_size_change)
so data we hold may be irrelevant.

In both cases it makes sense to discard any 'clean' buffers,
so they will be read back from the device if needed.

In the former case it makes sense to discard 'dirty' buffers
as there will never be anywhere safe to write the data.  In the
second case it *does*not* make sense to discard dirty buffers
as that will lead to file system corruption when you simply enlarge
the containing devices.

flush_disk calls __invalidate_device.
__invalidate_device calls both invalidate_inodes and invalidate_bdev.

invalidate_inodes *does* discard I_DIRTY inodes and this does lead
to fs corruption.

invalidate_bdev *does*not* discard dirty pages, but I don't really care
about that at present.

So this patch adds a flag to __invalidate_device (calling it
__invalidate_device2) to indicate whether dirty buffers should be
killed, and this is passed to invalidate_inodes which can choose to
skip dirty inodes.

flush_disk then passes true from check_disk_change and false from
check_disk_size_change.

dm avoids tripping over this problem by calling i_size_write directly
rather than using check_disk_size_change.

md does use check_disk_size_change and so is affected.

This regression was introduced by commit 608aeef17a which causes
check_disk_size_change to call flush_disk, so this fix is suitable for any
kernel since 2.6.27.

Cc: stable@kernel.org
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NeilBrown <neilb@suse.de>
2011-02-24 17:25:47 +11:00
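The policy introduced by 93b270f76e can be summarised in a few lines. The Python sketch below is a model with hypothetical data structures (a list of dicts with a "dirty" key), not the kernel code: clean buffers are dropped in both cases, while dirty ones are dropped only when the caller says the device is gone.

```python
def invalidate(buffers: list, kill_dirty: bool) -> list:
    """Return the buffers that survive invalidation."""
    return [b for b in buffers if b["dirty"] and not kill_dirty]


def on_disk_change(buffers: list) -> list:
    """Device disappeared: nothing can be written back, drop everything."""
    return invalidate(buffers, kill_dirty=True)


def on_disk_size_change(buffers: list) -> list:
    """Device was resized: keep dirty data so it can still be written."""
    return invalidate(buffers, kill_dirty=False)
```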
Jiri Kosina 0a9d59a246 Merge branch 'master' into for-next 2011-02-15 10:24:31 +01:00
Soren Hansen de1f016f88 nbd: remove module-level ioctl mutex
Commit 2a48fc0ab2 ("block: autoconvert trivial BKL users to private
mutex") replaced uses of the BKL in the nbd driver with mutex
operations.  Since then, I've been seeing these lock ups:

 INFO: task qemu-nbd:16115 blocked for more than 120 seconds.
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 qemu-nbd      D 0000000000000001     0 16115  16114 0x00000004
  ffff88007d775d98 0000000000000082 ffff88007d775fd8 ffff88007d774000
  0000000000013a80 ffff8800020347e0 ffff88007d775fd8 0000000000013a80
  ffff880133730000 ffff880002034440 ffffea0004333db8 ffffffffa071c020
 Call Trace:
  [<ffffffff815b9997>] __mutex_lock_slowpath+0xf7/0x180
  [<ffffffff815b93eb>] mutex_lock+0x2b/0x50
  [<ffffffffa071a21c>] nbd_ioctl+0x6c/0x1c0 [nbd]
  [<ffffffff812cb970>] blkdev_ioctl+0x230/0x730
  [<ffffffff811967a1>] block_ioctl+0x41/0x50
  [<ffffffff81175c03>] do_vfs_ioctl+0x93/0x370
  [<ffffffff81175f61>] sys_ioctl+0x81/0xa0
  [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b

Instrumenting the nbd module's ioctl handler with some extra logging
clearly shows the NBD_DO_IT ioctl being invoked which is a long-lived
ioctl in the sense that it doesn't return until another ioctl asks the
driver to disconnect.  However, that other ioctl blocks, waiting for the
module-level mutex that replaced the BKL, and then we're stuck.

This patch removes the module-level mutex altogether.  It's clearly
wrong, and as far as I can see, it's entirely unnecessary, since the nbd
driver maintains per-device mutexes, and I don't see anything that would
require a module-level (or kernel-level, for that matter) mutex.

Signed-off-by: Soren Hansen <soren@linux2go.dk>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Paul Clements <paul.clements@steeleye.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@kernel.org>		[2.6.37.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-11 16:12:20 -08:00
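The lock-up described in de1f016f88 can be modelled in userspace. In the Python sketch below, all names are hypothetical: one thread plays the long-lived NBD_DO_IT ioctl, and the caller plays the ioctl that asks for a disconnect. When both must take the same module-wide lock, the second one cannot get in while the first is still waiting for it. Timeouts are used only so the sketch terminates.

```python
import threading


def disconnect_succeeds(use_module_mutex: bool, timeout: float = 0.2) -> bool:
    """Return True if the disconnect request could get past the locking."""
    module_mutex = threading.Lock()   # stand-in for the module-level mutex
    disconnect = threading.Event()    # set by the disconnect request
    running = threading.Event()

    def do_it_ioctl() -> None:
        # Long-lived ioctl: returns only after a disconnect was requested.
        if use_module_mutex:
            module_mutex.acquire()
        running.set()
        disconnect.wait(timeout=5)    # bounded so the sketch always ends
        if use_module_mutex:
            module_mutex.release()

    worker = threading.Thread(target=do_it_ioctl)
    worker.start()
    running.wait()

    # The disconnect ioctl needs the same mutex before it can signal.
    got_lock = True
    if use_module_mutex:
        got_lock = module_mutex.acquire(timeout=timeout)
        if got_lock:
            module_mutex.release()

    disconnect.set()                  # unblock the worker either way
    worker.join()
    return got_lock
```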
Justin P. Mattock 8e572bab39 fix typos 'comamnd' -> 'command' in comments
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
2011-02-02 11:31:21 +01:00
Stephen M. Cameron 68264e9d67 cciss: make cciss_revalidate not loop through CISS_MAX_LUNS volumes unnecessarily.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-01-19 08:25:02 -07:00
Tracey Dent a0700bdd0b drivers/block/aoe/Makefile: replace the use of <module>-objs with <module>-y
Change Makefile to use <module>-y instead of <module>-objs because -objs
is deprecated and should now be switched, according to
Documentation/kbuild/makefiles.txt.

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-01-19 08:25:02 -07:00
Sergey Senozhatsky ee71a96867 loop: queue_lock NULL pointer dereference in blk_throtl_exit
Performing
$ sudo mount -o loop -o umask=0 /dev/sdb1 /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

$ sudo modprobe -r loop

results in oops:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
 IP: [<ffffffff812479d4>] do_raw_spin_lock+0x14/0x122
 Process modprobe (pid: 6189, threadinfo ffff88009a898000, task ffff880154a88000)
 Call Trace:
  [<ffffffff81486788>] _raw_spin_lock_irq+0x4a/0x51
  [<ffffffff8123404b>] ? blk_throtl_exit+0x3b/0xa0
  [<ffffffff8105b120>] ? cancel_delayed_work_sync+0xd/0xf
  [<ffffffff8123404b>] blk_throtl_exit+0x3b/0xa0
  [<ffffffff81229bc8>] blk_release_queue+0x21/0x65
  [<ffffffff8123bb06>] kobject_release+0x51/0x66
  [<ffffffff8123bab5>] ? kobject_release+0x0/0x66
  [<ffffffff8123ce1e>] kref_put+0x43/0x4d
  [<ffffffff8123ba27>] kobject_put+0x47/0x4b
  [<ffffffff8122717c>] blk_cleanup_queue+0x56/0x5b
  [<ffffffffa01c3824>] loop_exit+0x68/0x844 [loop]
  [<ffffffff8107cccc>] sys_delete_module+0x1e8/0x25b
  [<ffffffff814864c9>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff81002112>] system_call_fastpath+0x16/0x1b

because of an attempt to acquire NULL queue_lock.
I added the same lines as in blk_queue_make_request -
`fall back to embedded per-queue lock'.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-01-19 08:25:02 -07:00
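The fallback mentioned in ee71a96867 follows a common pattern: if the driver never supplied a lock, use the one embedded in the queue. The Python sketch below illustrates that pattern with hypothetical names (`RequestQueue`, `effective_lock`, `cleanup`); it is not the block layer code.

```python
import threading


class RequestQueue:
    """Hypothetical queue with an embedded lock and an optional driver lock."""

    def __init__(self):
        self._embedded_lock = threading.Lock()
        self.queue_lock = None  # the driver never set one


def effective_lock(q: RequestQueue) -> threading.Lock:
    """Fall back to the embedded per-queue lock if none was supplied."""
    if q.queue_lock is None:
        q.queue_lock = q._embedded_lock
    return q.queue_lock


def cleanup(q: RequestQueue) -> str:
    # Without the fallback this would try to lock None and fail.
    with effective_lock(q):
        return "cleaned"
```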
Tracey Dent 04de96c9c6 drivers/block/Makefile: replace the use of <module>-objs with <module>-y
Change Makefile to use <module>-y instead of <module>-objs because -objs
is deprecated and should now be switched, according to
Documentation/kbuild/makefiles.txt.

Signed-off-by: Tracey Dent <tdent48227@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-01-19 08:25:02 -07:00
Linus Torvalds 7b0cb1bdac Merge branch 'for-2.6.38/drivers' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.38/drivers' of git://git.kernel.dk/linux-2.6-block:
  cciss: reinstate proper FIFO order of command queue list
  floppy: replace NO_GEOM macro with a function
2011-01-13 10:50:24 -08:00
Linus Torvalds 275220f0fc Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits)
  block: ensure that completion error gets properly traced
  blktrace: add missing probe argument to block_bio_complete
  block cfq: don't use atomic_t for cfq_group
  block cfq: don't use atomic_t for cfq_queue
  block: trace event block fix unassigned field
  block: add internal hd part table references
  block: fix accounting bug on cross partition merges
  kref: add kref_test_and_get
  bio-integrity: mark kintegrityd_wq highpri and CPU intensive
  block: make kblockd_workqueue smarter
  Revert "sd: implement sd_check_events()"
  block: Clean up exit_io_context() source code.
  Fix compile warnings due to missing removal of a 'ret' variable
  fs/block: type signature of major_to_index(int) to major_to_index(unsigned)
  block: convert !IS_ERR(p) && p to !IS_ERR_OR_NULL(p)
  cfq-iosched: don't check cfqg in choose_service_tree()
  fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors
  cdrom: export cdrom_check_events()
  sd: implement sd_check_events()
  sr: implement sr_check_events()
  ...
2011-01-13 10:45:01 -08:00
Linus Torvalds a170315420 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: fix cleanup when trying to mount inexistent image
  net/ceph: make ceph_msgr_wq non-reentrant
  ceph: fsc->*_wq's aren't used in memory reclaim path
  ceph: Always free allocated memory in osdmap_decode()
  ceph: Makefile: Remove unnecessary code
  ceph: associate requests with opening sessions
  ceph: drop redundant r_mds field
  ceph: implement DIRLAYOUTHASH feature to get dir layout from MDS
  ceph: add dir_layout to inode
2011-01-13 10:25:24 -08:00
Yehuda Sadeh 766fc43973 rbd: fix cleanup when trying to mount inexistent image
Previously we didn't clean up the sysfs entry that was just
created.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-01-12 15:15:18 -08:00
Linus Torvalds 94d4c4cd56 Merge branch 'stable/xenbus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/xenbus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/xenbus: making backend support modular is too complex
  xen/pci: Make xen-pcifront be dependent on XEN_XENBUS_FRONTEND
  xen/xenbus: fixup checkpatch issues in xenbus_probe*
  xen/netfront: select XEN_XENBUS_FRONTEND
  xen/xenbus: clean up noise in xenbus_probe_frontend.c
  xen/xenbus: clean up noise in xenbus_probe_backend.c
  xen/xenbus: clean up noise in xenbus_probe.c
  xen/xenbus: cleanup debug noise in xenbus_comms.c
  xen/xenbus: clean up error handling
  xen/xenbus: make frontend bus GPL
  xen/xenbus: make sure backend bus is registered earlier
  xenbus/frontend: register bus earlier
  xen: remove xen/evtchn.h
  xen: add backend driver support
  xen: separate out frontend xenbus
2011-01-12 08:37:35 -08:00
Jens Axboe e6e1ee936d cciss: reinstate proper FIFO order of command queue list
Commit 8a3173de inadvertently changed the ordering when
switching to hlists. Change to regular list heads so we
can use tail list adds; this improves performance.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-01-10 21:50:33 +01:00
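The ordering issue in e6e1ee936d comes down to where new entries are inserted. The Python sketch below uses a `deque` as a stand-in for the command queue and assumes that the hlist variant inserted at the head; `submit_order` is a hypothetical helper. Head insertion hands commands back newest-first, tail insertion preserves FIFO order.

```python
from collections import deque


def submit_order(commands: list, add_to_tail: bool) -> list:
    """Order in which queued commands would be taken from the front."""
    queue = deque()
    for cmd in commands:
        if add_to_tail:
            queue.append(cmd)      # tail add: FIFO
        else:
            queue.appendleft(cmd)  # head add: newest command first
    return list(queue)
```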
Linus Torvalds 23d69b09b7 Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits)
  usb: don't use flush_scheduled_work()
  speedtch: don't abuse struct delayed_work
  media/video: don't use flush_scheduled_work()
  media/video: explicitly flush request_module work
  ioc4: use static work_struct for ioc4_load_modules()
  init: don't call flush_scheduled_work() from do_initcalls()
  s390: don't use flush_scheduled_work()
  rtc: don't use flush_scheduled_work()
  mmc: update workqueue usages
  mfd: update workqueue usages
  dvb: don't use flush_scheduled_work()
  leds-wm8350: don't use flush_scheduled_work()
  mISDN: don't use flush_scheduled_work()
  macintosh/ams: don't use flush_scheduled_work()
  vmwgfx: don't use flush_scheduled_work()
  tpm: don't use flush_scheduled_work()
  sonypi: don't use flush_scheduled_work()
  hvsi: don't use flush_scheduled_work()
  xen: don't use flush_scheduled_work()
  gdrom: don't use flush_scheduled_work()
  ...

Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c
as per Tejun.
2011-01-07 16:58:04 -08:00
Ian Campbell 2de06cc1f1 xen: separate out frontend xenbus
Impact: refactor

Make a distinct frontend xenbus, in preparation for adding a backend xenbus.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[corresponds to 2fd433a4188f in git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git
 with adjustments to reflect changes in the code which is moved]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-05 16:29:17 -05:00
David S. Miller 17f7f4d9fc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/ipv4/fib_frontend.c
2010-12-26 22:37:05 -08:00
Tejun Heo 30d65030fd xen: don't use flush_scheduled_work()
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush info->work instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-12-24 15:59:06 +01:00
Tejun Heo 8aa0f41384 floppy: don't use flush_scheduled_work()
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush floppy_work instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
2010-12-24 15:59:06 +01:00
Linus Torvalds 453434cf3f Fix build error in drivers/block/cciss.c
.. caused by a missing semi-colon, introduced in commit 0fc13c8995
("cciss: fix cciss_revalidate panic").

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Thiago Farina <tfransosi@gmail.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-20 21:21:49 -08:00
Linus Torvalds 7f8635cc9e Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  cciss: fix cciss_revalidate panic
  block: max hardware sectors limit wrapper
  block: Deprecate QUEUE_FLAG_CLUSTER and use queue_limits instead
  blk-throttle: Correct the placement of smp_rmb()
  blk-throttle: Trim/adjust slice_end once a bio has been dispatched
  block: check for proper length of iov entries earlier in blk_rq_map_user_iov()
  drbd: fix for spin_lock_irqsave in endio callback
  drbd: don't recvmsg with zero length
2010-12-20 09:19:46 -08:00
Jens Axboe 3603b8eacc Fix compile warnings due to missing removal of a 'ret' variable
Commit a8adbe3 forgot to remove the return variable, kill it.

drivers/block/loop.c: In function 'lo_splice_actor':
drivers/block/loop.c:398: warning: unused variable 'ret'
[...]
fs/nfsd/vfs.c: In function 'nfsd_splice_actor':
fs/nfsd/vfs.c:848: warning: unused variable 'ret'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-12-20 09:15:19 +01:00
Stephen M. Cameron 0fc13c8995 cciss: fix cciss_revalidate panic
If you delete a logical drive, and then run BLKRRPART (e.g. via fdisk)
on a logical drive which is "after" the deleted logical drive in the h->drv[]
array, then cciss_revalidate panics because it will access the null pointer
h->drv[x] when x hits the deleted drive.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-12-17 09:01:37 +01:00
Michał Mirosław a8adbe378b fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors
This patch pulls calls to buf->ops->confirm() from all actors passed
(also indirectly) to splice_from_pipe_feed().

Is avoiding the call to buf->ops->confirm() while splice()ing to
/dev/null an intentional optimization? No other user does that
and this will remove this special case.

Against current linux.git 6313e3c217.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-12-17 08:56:44 +01:00
Jeremy Fitzhardinge 667c78afae xen: Provide a variant of __RING_SIZE() that is an integer constant expression
Without this, gcc 4.5 won't compile xen-netfront and xen-blkfront, where
this is being used to specify array sizes.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Miller <davem@davemloft.net>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-15 12:34:28 -08:00
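Note on 667c78afae above: an array size at file scope has to be an integer constant expression, so a ring-size macro used there can only be built from constants, shifts and ?: operators. The sketch below shows one way to round down to a power of two under that restriction. The macro names and the byte counts are assumptions made for illustration and are not taken from the Xen headers.

```c
#include <assert.h>
#include <stddef.h>

/* Round a 32-bit value down to a power of two using only constant
 * operations, so the result stays an integer constant expression. */
#define RD2(x)  (((x) & 0x00000002U) ? 0x2U : ((x) & 0x1U))
#define RD4(x)  (((x) & 0x0000000cU) ? RD2((x) >> 2) << 2 : RD2(x))
#define RD8(x)  (((x) & 0x000000f0U) ? RD4((x) >> 4) << 4 : RD4(x))
#define RD16(x) (((x) & 0x0000ff00U) ? RD8((x) >> 8) << 8 : RD8(x))
#define RD32(x) (((x) & 0xffff0000U) ? RD16((x) >> 16) << 16 : RD16(x))

/* Hypothetical layout: one page, a fixed header, fixed-size slots. */
#define PAGE_BYTES   4096U
#define HEADER_BYTES 64U
#define SLOT_BYTES   112U
#define RING_SLOTS   RD32((PAGE_BYTES - HEADER_BYTES) / SLOT_BYTES)

/* File-scope array: this only compiles if RING_SLOTS is a constant. */
static int shadow[RING_SLOTS];

/* Number of entries the compiler actually reserved for the array. */
static size_t ring_slots(void)
{
    return sizeof(shadow) / sizeof(shadow[0]);
}
```

With these assumed sizes, 36 slots would fit in the page and the macro rounds that down to 32.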
Linus Torvalds 04ed0978d5 Merge branch 'rbd-sysfs' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'rbd-sysfs' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: replace the rbd sysfs interface
2010-12-02 08:05:22 -08:00
Yehuda Sadeh dfc5606dc5 rbd: replace the rbd sysfs interface
The new interface creates directories per mapped image
and under each it creates a subdir per available snapshot.
This allows keeping a cleaner interface within the sysfs
guidelines. The ABI documentation was updated too.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-01 15:53:22 -08:00
Lars Ellenberg a115413de1 drbd: fix for spin_lock_irqsave in endio callback
In commit 9b7f76dc37919ea36caa9680a3f765e5b19b25fb,
 Author: Lars Ellenberg <lars.ellenberg@linbit.com>
 Date:   Wed Aug 11 23:40:24 2010 +0200

    drbd: new configuration parameter c-min-rate

a bad chunk slipped through, which is now reverted as well,
restoring the correct irqsave for the endio callback.

This patch also adds comments at both req_mod()
and in the endio callback so it should not happen again.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-11-27 19:50:43 +01:00
Lars Ellenberg c13f7e1a94 drbd: don't recvmsg with zero length
This should fix a performance degradation we observed recently.

If we don't expect any subheader, we should not call into the tcp stack,
as that may add considerable latency if there is no data available at
this point.

For a synthetic synchronous write load with single outstanding writes,
this additional latency when processing the "unplug remote" packet
added up to a performance degradation factor >= 10.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-11-27 19:50:43 +01:00
Jens Axboe f30195c502 Merge branch 'cleanup-bd_claim' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into for-2.6.38/core
2010-11-27 19:49:18 +01:00
Linus Torvalds 78daa87b1d Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  cciss: fix build for PROC_FS disabled
  block: fix amiga and atari floppy driver compile warning
  blk-throttle: Fix calculation of max number of WRITES to be dispatched
  ioprio: grab rcu_read_lock in sys_ioprio_{set,get}()
  xen/blkfront: cope with backend that fail empty BLKIF_OP_WRITE_BARRIER requests
  xen/blkfront: Implement FUA with BLKIF_OP_WRITE_BARRIER
  xen/blkfront: change blk_shadow.request to proper pointer
  xen/blkfront: map REQ_FLUSH into a full barrier
2010-11-27 07:17:50 +09:00
Arnd Bergmann 451a3c24b0 BKL: remove extraneous #include <smp_lock.h>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-17 08:59:32 -08:00
Jens Axboe bbe425cd9a cciss: fix build for PROC_FS disabled
The recent patch to fix the removal of a non-existing proc
directory introduced this build problem for !CONFIG_PROC_FS:

drivers/block/cciss.c:4929: error: 'proc_cciss' undeclared (first use in this function)

Fix it by moving proc_cciss outside of the CONFIG_PROC_FS scope.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-17 11:56:13 +01:00
Jeff Garzik f281233d3e SCSI host lock push-down
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.

The patch below presents a simple SCSI host lock push-down as an
equivalent transformation.  No locking or other behavior should change
with this patch.  All existing bugs and locking orders are preserved.

Additionally, add one parameter to queuecommand,
	struct Scsi_Host *
and remove one parameter from queuecommand,
	void (*done)(struct scsi_cmnd *)

Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.

Minimal code disturbance was attempted with this change.  Most drivers
needed only two one-line modifications for their host lock push-down.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-16 13:33:23 -08:00
Vivek Goyal 3e9bb2a071 block: fix amiga and atari floppy driver compile warning
Geert, my crosstool doesn't produce the warning below. I guess this has
something to do with the compiler version.

- Geert noticed the following warnings during compilation.

  drivers/block/amiflop.c:1344: warning: ‘rq’ may be used uninitialized in
  this function
  drivers/block/ataflop.c:1402: warning: ‘rq’ may be used uninitialized in
  this function

- Initialize rq to NULL to fix the warning. If we can't find a suitable request
  to dispatch, this function should return NULL instead of a possibly garbage
  pointer.

- Cross compile tested only. Don't have hardware to test it.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-15 19:32:43 +01:00
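Note on 3e9bb2a071 above: the pattern behind the fix is a lookup loop whose result variable is only assigned when something is found. A minimal sketch, assuming hypothetical struct request and struct queue types rather than the real amiflop/ataflop structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
struct request { int id; };
struct queue { struct request *pending; };

/* Return the first pending request, or NULL when every queue is empty. */
static struct request *pick_request(struct queue *queues, int n)
{
    /* Initialised up front: if the loop finds nothing, the caller gets
     * NULL instead of whatever happened to be on the stack. */
    struct request *rq = NULL;

    assert(queues != NULL);
    for (int i = 0; i < n; i++) {
        if (queues[i].pending) {
            rq = queues[i].pending;
            queues[i].pending = NULL;   /* hand the request out once */
            break;
        }
    }
    return rq;
}
```

Without the initialiser the empty case returns an indeterminate pointer, which is what the compiler warning points at.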
David S. Miller c25ecd0a21 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-11-14 11:57:05 -08:00
Tejun Heo d4d7762995 block: clean up blkdev_get() wrappers and their users
After recent blkdev_get() modifications, open_by_devnum() and
open_bdev_exclusive() are simple wrappers around blkdev_get().
Replace them with blkdev_get_by_dev() and blkdev_get_by_path().

blkdev_get_by_dev() is identical to open_by_devnum().
blkdev_get_by_path() is slightly different in that it doesn't
automatically add %FMODE_EXCL to @mode.

All users are converted.  Most conversions are mechanical and don't
introduce any behavior difference.  There are several exceptions.

* btrfs now sets FMODE_EXCL in btrfs_device->mode, so there's no
  reason to OR it explicitly on blkdev_put().

* gfs2, nilfs2 and the generic mount_bdev() now set FMODE_EXCL in
  sb->s_mode.

* With the above changes, sb->s_mode now always should contain
  FMODE_EXCL.  WARN_ON_ONCE() added to kill_block_super() to detect
  errors.

The new blkdev_get_*() functions are with proper docbook comments.
While at it, add function description to blkdev_get() too.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Joern Engel <joern@lazybastard.org>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: reiserfs-devel@vger.kernel.org
Cc: xfs-masters@oss.sgi.com
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
2010-11-13 11:55:18 +01:00
Tejun Heo e525fd89d3 block: make blkdev_get/put() handle exclusive access
Over time, block layer has accumulated a set of APIs dealing with bdev
open, close, claim and release.

* blkdev_get/put() are the primary open and close functions.

* bd_claim/release() deal with exclusive open.

* open/close_bdev_exclusive() are combination of open and claim and
  the other way around, respectively.

* bd_link/unlink_disk_holder() to create and remove holder/slave
  symlinks.

* open_by_devnum() wraps bdget() + blkdev_get().

The interface is a bit confusing and the decoupling of open and claim
makes it impossible to properly guarantee exclusive access as
in-kernel open + claim sequence can disturb the existing exclusive
open even before the block layer knows the current open is for another
exclusive access.  Reorganize the interface such that,

* blkdev_get() is extended to include exclusive access management.
  @holder argument is added and, if @FMODE_EXCL is specified, it will
  gain exclusive access atomically w.r.t. other exclusive accesses.

* blkdev_put() is similarly extended.  It now takes @mode argument and
  if @FMODE_EXCL is set, it releases an exclusive access.  Also, when
  the last exclusive claim is released, the holder/slave symlinks are
  removed automatically.

* bd_claim/release() and close_bdev_exclusive() are no longer
  necessary and either made static or removed.

* bd_link_disk_holder() remains the same but bd_unlink_disk_holder()
  is no longer necessary and removed.

* open_bdev_exclusive() becomes a simple wrapper around lookup_bdev()
  and blkdev_get().  It also has an unexpected extra bdev_read_only()
  test which probably should be moved into blkdev_get().

* open_by_devnum() is modified to take @holder argument and pass it to
  blkdev_get().

Most of bdev open/close operations are unified into blkdev_get/put()
and most exclusive accesses are tested atomically at the open time (as
it should).  This cleans up code and removes some, both valid and
invalid, but unnecessary all the same, corner cases.

open_bdev_exclusive() and open_by_devnum() can use further cleanup -
rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop
special features.  Well, let's leave them for another day.

Most conversions are straight-forward.  drbd conversion is a bit more
involved as there was some reordering, but the logic should stay the
same.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Neil Brown <neilb@suse.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: dm-devel@redhat.com
Cc: drbd-dev@lists.linbit.com
Cc: Leo Chen <leochen@broadcom.com>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Joern Engel <joern@logfs.org>
Cc: reiserfs-devel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
2010-11-13 11:55:17 +01:00
Linus Torvalds 8a9f772c14 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits)
  block: remove unused copy_io_context()
  Documentation: remove anticipatory scheduler info
  block: remove REQ_HARDBARRIER
  ioprio: rcu_read_lock/unlock protect find_task_by_vpid call (V2)
  ioprio: fix RCU locking around task dereference
  block: ioctl: fix information leak to userland
  block: read i_size with i_size_read()
  cciss: fix proc warning on attempt to remove non-existant directory
  bio: take care not overflow page count when mapping/copying user data
  block: limit vec count in bio_kmalloc() and bio_alloc_map_data()
  block: take care not to overflow when calculating total iov length
  block: check for proper length of iov entries in blk_rq_map_user_iov()
  cciss: remove controllers supported by hpsa
  cciss: use usleep_range not msleep for small sleeps
  cciss: limit commands allocated on reset_devices
  cciss: Use kernel provided PCI state save and restore functions
  cciss: fix board status waiting code
  drbd: Removed checks for REQ_HARDBARRIER on incomming BIOs
  drbd: REQ_HARDBARRIER -> REQ_FUA transition for meta data accesses
  drbd: Removed the BIO_RW_BARRIER support form the receiver/epoch code
  ...
2010-11-12 08:52:47 -08:00
Jens Axboe 1ff5125fb8 Merge branch 'upstream/blkfront' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into for-linus
Conflicts:
	drivers/block/xen-blkfront.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-12 08:47:04 +01:00
Christoph Hellwig 02e031cbc8 block: remove REQ_HARDBARRIER
REQ_HARDBARRIER is dead now, so remove the leftovers.  What's left
at this point is:

 - various checks inside the block layer.
 - sanity checks in bio based drivers.
 - now unused bio_empty_barrier helper.
 - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while,
   but Xen really needs to sort out its barrier situation.
 - setting of ordered tags in uas - dead code copied from old scsi
   drivers.
 - scsi different retry for barriers - it's dead and should have been
   removed when flushes were converted to FS requests.
 - blktrace handling of barriers - removed.  Someone who knows blktrace
   better should add support for REQ_FLUSH and REQ_FUA, though.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-10 14:54:09 +01:00
Jens Axboe 00e375e7e9 Merge branch 'for-2.6.37/drivers' into for-linus
Conflicts:
	drivers/block/cciss.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-10 14:51:27 +01:00
Mike Snitzer 77304d2aba block: read i_size with i_size_read()
Convert direct reads of an inode's i_size to using i_size_read().

i_size_{read,write} use a seqcount to protect reads from accessing
incomplete writes.  Concurrent i_size_write()s require mutual exclusion
to protect the seqcount that is used by i_size_{read,write}.  But
i_size_read() callers do not need to use additional locking.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: NeilBrown <neilb@suse.de>
Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-10 14:40:53 +01:00
Jens Axboe 90fdb0b98a cciss: fix proc warning on attempt to remove non-existant directory
Randy reports that he gets the following stack trace when
removing the cciss module:

[  109.164277] Pid: 3463, comm: rmmod Not tainted 2.6.37-rc1 #7
[  109.164280] Call Trace:
[  109.164292]  [<ffffffff8107eb8d>] warn_slowpath_common+0xc6/0xf3
[  109.164299]  [<ffffffff8107ecaa>] warn_slowpath_fmt+0x5b/0x6b
[  109.164307]  [<ffffffff8155175b>] ? _raw_spin_unlock+0x40/0x4b
[  109.164313]  [<ffffffff8123dd1e>] remove_proc_entry+0x156/0x35e
[  109.164320]  [<ffffffff812cd91b>] ? do_raw_spin_unlock+0xff/0x10f
[  109.164327]  [<ffffffff8113823d>] ? trace_hardirqs_on+0x10/0x4a
[  109.164333]  [<ffffffff8155162d>] ? _raw_spin_unlock_irq+0x4c/0x7b
[  109.164339]  [<ffffffff8154d4d1>] ? wait_for_common+0x145/0x15e
[  109.164345]  [<ffffffff81075337>] ? default_wake_function+0x0/0x22
[  109.164357]  [<ffffffffa0615a8f>] cciss_cleanup+0xa9/0xc7 [cciss]
[  109.164365]  [<ffffffff810d3cb0>] sys_delete_module+0x2d6/0x368
[  109.164371]  [<ffffffff8155036b>] ? lockdep_sys_exit_thunk+0x35/0x67
[  109.164377]  [<ffffffff810fdfaf>] ? audit_syscall_entry+0x172/0x1a5
[  109.164383]  [<ffffffff815502f5>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  109.164389]  [<ffffffff8100ea72>] system_call_fastpath+0x16/0x1b
[  109.164394] ---[ end trace 88e8568246ed0b1d ]---

which will happen if you don't actually have an HP CISS adapter,
since it'll do an unconditional removal of a proc directory it
never attempted to create in that case.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-10 14:40:52 +01:00
Eric Dumazet 840a185ddd aoe: remove dev_base_lock use from aoecmd_cfg_pkts()
dev_base_lock is the legacy way to lock the device list, and is planned
to disappear. (writers hold RTNL, readers hold RCU lock)

Convert aoecmd_cfg_pkts() to RCU locking.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-08 13:50:07 -08:00
Pekka Enberg 2b51dca79a floppy: replace NO_GEOM macro with a function
This patch replaces the NO_GEOM macro with a proper static inline function and
converts an open-coded caller in check_floppy_change() to use it.

Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-08 14:44:34 +01:00
Vivek Goyal d017bf6b4f floppy: fix another use-after-free
While scanning the floppy code due to c093ee4f07 ("floppy: fix
use-after-free in module load failure path"), I found one more instance
of trying to access disk->queue pointer after doing put_disk() on
gendisk.  For some reason, the floppy module still loads/unloads fine.  The
object is probably still around with right pointer values.

 o There seems to be one more instance of trying to cleanup the request
   queue after we have called put_disk() on associated gendisk.

 o This fix is more out of code inspection.  Even without this fix for
   some reason I am able to load/unload floppy module without any
   issues.

 o Floppy module loads/unloads fine after the fix.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-06 07:49:56 -07:00
Linus Torvalds c093ee4f07 floppy: fix use-after-free in module load failure path
Commit 488211844e ("floppy: switch to one queue per drive instead of
sharing a queue") introduced a use-after-free.  We do "put_disk()" on
the disk device _before_ we then clean up the queue associated with that
disk.

Move the put_disk() down to avoid dereferencing a free'd data structure.

Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Reported-and-tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-05 17:45:59 -07:00
Jeremy Fitzhardinge dcb8baecea xen/blkfront: cope with backend that fail empty BLKIF_OP_WRITE_BARRIER requests
Some(?) Xen block backends fail BLKIF_OP_WRITE_BARRIER requests, which
Linux uses as a cache flush operation.  In that case, disable use
of FLUSH.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
2010-11-02 13:46:46 -04:00
Jeremy Fitzhardinge be2f8373c1 xen/blkfront: Implement FUA with BLKIF_OP_WRITE_BARRIER
The BLKIF_OP_WRITE_BARRIER is a full ordered barrier, so we can use it
to implement FUA as well as a plain FLUSH.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2010-11-02 11:27:59 -04:00
Jeremy Fitzhardinge a945b9801a xen/blkfront: change blk_shadow.request to proper pointer
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-11-02 11:27:58 -04:00
Jeremy Fitzhardinge c64e38ea17 xen/blkfront: map REQ_FLUSH into a full barrier
Implement a flush as a full barrier, since we have nothing weaker.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2010-11-02 10:43:51 -04:00
Linus Torvalds 18cb657ca1 Merge branch 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm

* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
  xen: register xen pci notifier
  xen: initialize cpu masks for pv guests in xen_smp_init
  xen: add a missing #include to arch/x86/pci/xen.c
  xen: mask the MTRR feature from the cpuid
  xen: make hvc_xen console work for dom0.
  xen: add the direct mapping area for ISA bus access
  xen: Initialize xenbus for dom0.
  xen: use vcpu_ops to setup cpu masks
  xen: map a dummy page for local apic and ioapic in xen_set_fixmap
  xen: remap MSIs into pirqs when running as initial domain
  xen: remap GSIs as pirqs when running as initial domain
  xen: introduce XEN_DOM0 as a silent option
  xen: map MSIs into pirqs
  xen: support GSI -> pirq remapping in PV on HVM guests
  xen: add xen hvm acpi_register_gsi variant
  acpi: use indirect call to register gsi in different modes
  xen: implement xen_hvm_register_pirq
  xen: get the maximum number of pirqs from xen
  xen: support pirq != irq

* 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits)
  X86/PCI: Remove the dependency on isapnp_disable.
  xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c
  MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright.
  x86: xen: Sanitse irq handling (part two)
  swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it.
  MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer.
  xen/pci: Request ACS when Xen-SWIOTLB is activated.
  xen-pcifront: Xen PCI frontend driver.
  xenbus: prevent warnings on unhandled enumeration values
  xenbus: Xen paravirtualised PCI hotplug support.
  xen/x86/PCI: Add support for the Xen PCI subsystem
  x86: Introduce x86_msi_ops
  msi: Introduce default_[teardown|setup]_msi_irqs with fallback.
  x86/PCI: Export pci_walk_bus function.
  x86/PCI: make sure _PAGE_IOMAP it set on pci mappings
  x86/PCI: Clean up pci_cache_line_size
  xen: fix shared irq device passthrough
  xen: Provide a variant of xen_poll_irq with timeout.
  xen: Find an unbound irq number in reverse order (high to low).
  xen: statically initialize cpu_evtchn_mask_p
  ...

Fix up trivial conflicts in drivers/pci/Makefile
2010-10-28 17:11:17 -07:00
Mike Miller 6fa9775208 cciss: remove overlapping PCI IDs
This patch removes the controller overlap between cciss and hpsa. It was
decided that no overlap should exist. All new controllers will use the hpsa
SCSI based driver.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-28 06:33:27 -06:00
Vasiliy Kulikov 7ab5118d7c block: cciss: fix information leak to userland
Structure IOCTL_Command_struct is copied to userland with
some padding fields at the end of the struct uninitialized.
It leads to leaking of contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-28 06:31:55 -06:00
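Note on 7ab5118d7c above: the usual remedy for this kind of leak is to zero the whole structure before filling in its fields, so padding bytes never carry old stack contents. The sketch below assumes a hypothetical struct reply; it is not IOCTL_Command_struct, and whether the real fix uses memset() is not stated in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reply structure; most ABIs insert padding after 'tag'. */
struct reply {
    char tag;
    long value;
};

/* Zero every byte first, padding included, then set the fields. */
static void fill_reply(struct reply *r, char tag, long value)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
    r->tag = tag;
    r->value = value;
}

/* Count how many of the n bytes at p are equal to b. */
static size_t count_byte(const void *p, size_t n, unsigned char b)
{
    const unsigned char *c = p;
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (c[i] == b)
            hits++;
    return hits;
}
```

Filling the buffer with a marker byte beforehand and counting the markers afterwards is a quick way to check that nothing stale survives the fill.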
Andrew Morton 027b180d74 drivers/block/aoe/aoeblk.c: ratelimit a warning printk
As described in https://bugzilla.kernel.org/show_bug.cgi?id=19922

: I had an AoE device go down overnight, and while a server was trying to
: write to it, it was also writing this message to its logs:
:
: 209                 printk(KERN_INFO "aoe: device %ld.%d is not up\n",
: 210                         d->aoemajor, d->aoeminor);
:
: The message appeared many times per second, and over several hours
: produced about 7.5 gigabytes of log files, filling up all free space on
: the root filesystem.

Cc: "Ed L. Cashin" <ecashin@coraid.com>
Suggested-by: Roman Mamedov <roman@rm.pp.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-28 06:15:26 -06:00
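Note on 027b180d74 above: rate limiting a message means allowing a small burst per time window and dropping the rest. The sketch below is a generic illustration with hypothetical names (struct ratelimit, ratelimit_ok()); it is not the kernel's ratelimit implementation, and the time is passed in as a plain tick count to keep the example deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limiter state: at most 'burst' messages per 'interval' ticks. */
struct ratelimit {
    long interval;  /* window length in ticks */
    int burst;      /* messages allowed per window */
    long begin;     /* tick at which the current window started */
    int printed;    /* messages emitted in the current window */
};

/* Return 1 if a message may be emitted at tick 'now', 0 to suppress it. */
static int ratelimit_ok(struct ratelimit *rl, long now)
{
    assert(rl != NULL);
    if (now - rl->begin >= rl->interval) {
        /* The window has expired: start a new one. */
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }
    return 0;
}
```

A caller would wrap the log statement in if (ratelimit_ok(&rl, now)), so a device that stays down produces a bounded number of lines per window instead of filling the disk.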
Geert Uytterhoeven e1fbd9210d drivers/block/z2ram.c: correct printing of sector_t
If CONFIG_LBDAF=y, `sector_t' becomes `u64' instead of `unsigned long':

drivers/block/z2ram.c: In function `do_z2_request':
drivers/block/z2ram.c:83: warning: format %lu expects type `long unsigned int', but argument 2 has type `sector_t'

Hence always cast it to `unsigned long long' for printing.  Also do the
pr_err() dance, while we're at it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-28 06:15:26 -06:00
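Note on e1fbd9210d above: when a typedef can be either unsigned long or a 64-bit type depending on configuration, casting to unsigned long long and printing with %llu is correct in both cases. A userspace sketch, assuming a hypothetical sector_t typedef and message text:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in: 64-bit, as with CONFIG_LBDAF=y in the message. */
typedef uint64_t sector_t;

/* Format a sector number portably; returns what snprintf() returns. */
static int format_sector(char *buf, size_t len, sector_t s)
{
    assert(buf != NULL);
    /* The cast makes the argument match %llu whatever sector_t is. */
    return snprintf(buf, len, "bad sector %llu", (unsigned long long)s);
}
```

The same cast works unchanged if the typedef is switched back to unsigned long, which is the point of always casting.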
Tejun Heo 5ad21a3374 aoe: don't use flush_scheduled_work()
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly cancel aoedev->work on free instead of depending on
flush_scheduled_work().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-28 06:15:26 -06:00
Nicolas Kaiser 2027ae1fa9 drivers/block/drbd/drbd_main.c: fix error path
Failure to create drbd_ee_mempool appears not to get checked.  Looks like
a copy-and-paste problem to me.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-28 06:15:26 -06:00
Milan Broz 51a0bb0c2e loop: Properly clear sysfs in autoclear mode
In autoclear mode bdev is NULL but the sysfs
entry should be destroyed otherwise this warning appears:

WARNING: at fs/sysfs/dir.c:451 sysfs_add_one+0x82/0x95()
sysfs: cannot create duplicate filename '/devices/virtual/block/loop0/loop'

Fixes commit ee86273062

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-27 19:51:30 -06:00
Peter Zijlstra 61ecdb801e mm: strictly nested kmap_atomic()
Ensure kmap_atomic() usage is strictly nested

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:08 -07:00
Linus Torvalds 51f00a471c Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
  mtd/m25p80: add support to parse the partitions by OF node
  of/irq: of_irq.c needs to include linux/irq.h
  of/mips: Cleanup some include directives/files.
  of/mips: Add device tree support to MIPS
  of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch
  of/device: Rework to use common platform_device_alloc() for allocating devices
  of/xsysace: Fix OF probing on little-endian systems
  of: use __be32 types for big-endian device tree data
  of/irq: remove references to NO_IRQ in drivers/of/platform.c
  of/promtree: add package-to-path support to pdt
  of/promtree: add of_pdt namespace to pdt code
  of/promtree: no longer call prom_ functions directly; use an ops structure
  of/promtree: make drivers/of/pdt.c no longer sparc-only
  sparc: break out some PROM device-tree building code out into drivers/of
  of/sparc: convert various prom_* functions to use phandle
  sparc: stop exporting openprom.h header
  powerpc, of_serial: Endianness issues setting up the serial ports
  of: MTD: Fix OF probing on little-endian systems
  of: GPIO: Fix OF probing on little-endian systems
2010-10-25 08:19:14 -07:00
Stephen M. Cameron 4205df3400 cciss: remove controllers supported by hpsa
We would prefer not to have any overlap between the two drivers.
Remove the cciss_allow_hpsa option, as it is no longer needed.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-23 18:47:31 +02:00
Stephen M. Cameron 332c2f80a8 cciss: use usleep_range not msleep for small sleeps
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-23 18:45:09 +02:00
Stephen M. Cameron 186fb9cf6a cciss: limit commands allocated on reset_devices
This is to conserve memory in a memory-limited kdump scenario

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-23 18:45:08 +02:00
Stephen M. Cameron f442e64b93 cciss: Use kernel provided PCI state save and restore functions
and use the doorbell reset method if available (which doesn't
lock up the controller if you properly save and restore all
the PCI registers that you're supposed to.)

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-23 18:45:07 +02:00
Stephen M. Cameron afa842fa64 cciss: fix board status waiting code
After a reset, we should first wait for the board to become "not ready",
and then wait for it to become "ready", instead of immediately
waiting for it to become "ready", and do this waiting *after*
restoring PCI config space registers.
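
A minimal userspace sketch of the two-phase wait described above. The callback type, the fake board, and the poll limit are assumptions made for illustration; they are not the cciss driver's actual interfaces, and a real driver would sleep between polls.

```c
#include <stdbool.h>

/* Hypothetical callback: true when the (simulated) board reports "ready". */
typedef bool (*board_ready_fn)(void *ctx);

/* Poll until the board's readiness equals want_ready, or give up after
 * max_polls attempts. Returns 0 on success, -1 on timeout. */
static int wait_for_board_state(board_ready_fn is_ready, void *ctx,
                                bool want_ready, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (is_ready(ctx) == want_ready)
            return 0;
    }
    return -1;
}

/* To be called after PCI config space has been restored:
 * first observe "not ready", and only then wait for "ready". */
static int wait_after_reset(board_ready_fn is_ready, void *ctx, int max_polls)
{
    if (wait_for_board_state(is_ready, ctx, false, max_polls) != 0)
        return -1; /* the board never left its pre-reset "ready" state */
    return wait_for_board_state(is_ready, ctx, true, max_polls);
}

/* Simulated board: reports "ready" for `stale` polls (state left over
 * from before the reset), "not ready" for `busy` polls, then "ready". */
struct fake_board { int polls; int stale; int busy; };

static bool fake_board_ready(void *ctx)
{
    struct fake_board *b = ctx;
    int n = b->polls++;
    return n < b->stale || n >= b->stale + b->busy;
}
```

Waiting only for "ready" would return on the first poll of the fake board above, while it still shows the stale pre-reset state; the intermediate "not ready" phase is what rules that out.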

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-23 18:45:06 +02:00
Jens Axboe 53c2eb24ff Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-2.6.37/drivers 2010-10-23 18:43:55 +02:00
Philipp Reisner 650789c87f drbd: Removed checks for REQ_HARDBARRIER on incoming BIOs
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-23 13:02:34 +02:00
Philipp Reisner a8a4e51e69 drbd: REQ_HARDBARRIER -> REQ_FUA transition for meta data accesses
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-23 13:01:45 +02:00
Philipp Reisner 2451fc3b2b drbd: Removed the BIO_RW_BARRIER support from the receiver/epoch code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-23 13:00:48 +02:00
Linus Torvalds 5cc1035062 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (141 commits)
  USB: mct_u232: fix broken close
  USB: gadget: amd5536udc.c: fix error path
  USB: imx21-hcd - fix off by one resource size calculation
  usb: gadget: fix Kconfig warning
  usb: r8a66597-udc: Add processing when USB was removed.
  mxc_udc: add workaround for ENGcm09152 for i.MX35
  USB: ftdi_sio: add device ids for ScienceScope
  USB: musb: AM35x: Workaround for fifo read issue
  USB: musb: add musb support for AM35x
  USB: AM35x: Add musb support
  usb: Fix linker errors with CONFIG_PM=n
  USB: ohci-sh - use resource_size instead of defining its own resource_len macro
  USB: isp1362-hcd - use resource_size instead of defining its own resource_len macro
  USB: isp116x-hcd - use resource_size instead of defining its own resource_len macro
  USB: xhci: Fix compile error when CONFIG_PM=n
  USB: accept some invalid ep0-maxpacket values
  USB: xHCI: PCI power management implementation
  USB: xHCI: bus power management implementation
  USB: xHCI: port remote wakeup implementation
  USB: xHCI: port power management implementation
  ...

Manually fix up (non-data) conflict: the SCSI merge had renamed the
'hw_sector_size' member to 'physical_block_size', and the USB tree
brought a new use of it.
2010-10-22 20:30:48 -07:00
Linus Torvalds a2887097f2 Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block: (46 commits)
  xen-blkfront: disable barrier/flush write support
  Added blk-lib.c and blk-barrier.c was renamed to blk-flush.c
  block: remove BLKDEV_IFL_WAIT
  aic7xxx_old: removed unused 'req' variable
  block: remove the BH_Eopnotsupp flag
  block: remove the BLKDEV_IFL_BARRIER flag
  block: remove the WRITE_BARRIER flag
  swap: do not send discards as barriers
  fat: do not send discards as barriers
  ext4: do not send discards as barriers
  jbd2: replace barriers with explicit flush / FUA usage
  jbd2: Modify ASYNC_COMMIT code to not rely on queue draining on barrier
  jbd: replace barriers with explicit flush / FUA usage
  nilfs2: replace barriers with explicit flush / FUA usage
  reiserfs: replace barriers with explicit flush / FUA usage
  gfs2: replace barriers with explicit flush / FUA usage
  btrfs: replace barriers with explicit flush / FUA usage
  xfs: replace barriers with explicit flush / FUA usage
  block: pass gfp_mask and flags to sb_issue_discard
  dm: convey that all flushes are processed as empty
  ...
2010-10-22 17:07:18 -07:00
Linus Torvalds 8abfc6e7a4 Merge branch 'for-2.6.37/drivers' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.37/drivers' of git://git.kernel.dk/linux-2.6-block: (95 commits)
  cciss: fix PCI IDs for new Smart Array controllers
  drbd: add race-breaker to drbd_go_diskless
  drbd: use dynamic_dev_dbg to optionally log uuid changes
  dynamic_debug.h: Fix dynamic_dev_dbg() macro if CONFIG_DYNAMIC_DEBUG not set
  drbd: cleanup: change "<= 0" to "== 0"
  drbd: relax the grace period of the md_sync timer again
  drbd: add some more explicit drbd_md_sync
  drbd: drop wrong debug asserts, fix recently introduced race
  drbd: cleanup useless leftover warn/error printk's
  drbd: add explicit drbd_md_sync to drbd_resync_finished
  drbd: Do not log an ASSERT for P_OV_REQUEST packets while C_CONNECTED
  drbd: fix for possible deadlock on IO error during resync
  drbd: fix unlikely access after free and list corruption
  drbd: fix for spurious fullsync (uuids rotated too fast)
  drbd: allow for explicit resync-finished notifications
  drbd: preparation commit, using full state in receive_state()
  drbd: drbd_send_ack_dp must not rely on header information
  drbd: Fix regression in recv_bm_rle_bits (compressed bitmap)
  drbd: Fixed a stupid copy and paste error
  drbd: Allow larger values for c-fill-target.
  ...

Fix up trivial conflict in drivers/block/ataflop.c due to BKL removal
2010-10-22 17:03:12 -07:00
Linus Torvalds e9dd2b6837 Merge branch 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block: (39 commits)
  cfq-iosched: Fix a gcc 4.5 warning and put some comments
  block: Turn bvec_k{un,}map_irq() into static inline functions
  block: fix accounting bug on cross partition merges
  block: Make the integrity mapped property a bio flag
  block: Fix double free in blk_integrity_unregister
  block: Ensure physical block size is unsigned int
  blkio-throttle: Fix possible multiplication overflow in iops calculations
  blkio-throttle: limit max iops value to UINT_MAX
  blkio-throttle: There is no need to convert jiffies to milli seconds
  blkio-throttle: Fix link failure on i386
  blkio: Recalculate the throttled bio dispatch time upon throttle limit change
  blkio: Add root group to td->tg_list
  blkio: deletion of a cgroup was causes oops
  blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n
  block: set the bounce_pfn to the actual DMA limit rather than to max memory
  block: revert bad fix for memory hotplug causing bounces
  Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK
  block: set the bounce_pfn to the actual DMA limit rather than to max memory
  block: Prevent hang_check firing during long I/O
  cfq: improve fsync performance for small files
  ...

Fix up trivial conflicts due to __rcu sparse annotation in include/linux/genhd.h
2010-10-22 17:00:32 -07:00
Stefano Stabellini 67ba37293e Merge commit 'konrad/stable/xen-pcifront-0.8.2' into 2.6.36-rc8-initial-domain-v6 2010-10-22 21:24:06 +01:00
Linus Torvalds 092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Linus Torvalds c37927d435 Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  block: autoconvert trivial BKL users to private mutex
  drivers: autoconvert trivial BKL users to private mutex
  ipmi: autoconvert trivial BKL users to private mutex
  mac: autoconvert trivial BKL users to private mutex
  mtd: autoconvert trivial BKL users to private mutex
  scsi: autoconvert trivial BKL users to private mutex

Fix up trivial conflicts (due to addition of private mutex right next to
deletion of a version string) in drivers/char/pcmcia/cm40[04]0_cs.c
2010-10-22 10:49:54 -07:00
Michal Nazarewicz 8fa7fd74ef USB: storage: Use USB_ prefix instead of US_ prefix
This commit changes prefix for some of the USB mass storage
class related macros (ie. USB_SC_ for subclass and USB_PR_
for class).

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-22 10:21:49 -07:00
Philipp Reisner 8825f7c3e5 drbd: Silenced an assert
That assertion's condition needed adjustment for today's semantics

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-22 15:55:22 +02:00
Lars Ellenberg fb2c7a10ee drbd: rate limit an error message
If we don't rate limit it, and you happen to log err level messages via
serial console, an IO error on a disconnected Primary may cause serious
unresponsiveness.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-22 15:53:10 +02:00
Lars Ellenberg bc571b8cb9 drbd: fix a misleading printk
This codepath used to be called only for failed kmalloc GFP_ATOMIC,
but is now also triggered by other things.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-22 15:51:22 +02:00
Lars Ellenberg 6719fb036c drbd: fix potential data divergence after multiple failures
If we get an IO-error during an activity log transaction,
if we failed to write the bitmap of the evicted extent,
we must not write the transaction itself.
If we failed to write the transaction,
we must not even submit the corresponding bio,
as its extent is not yet marked in the activity log.

Otherwise, if this was a disconnected Primary (degraded cluster), which
now lost its disk as well, and we later re-attach the same backend
storage, we possibly "forget" to resync some parts of the disk that
potentially have been changed.

On the receiving side, when receiving from a peer with unhealthy disk,
checking for pdsk == D_DISKLESS is not enough, we need to set out of
sync and do AL transactions for everything pdsk < D_INCONSISTENT on the
receiving side.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-22 15:50:27 +02:00
Lars Ellenberg 82f59cc635 drbd: fix potential deadlock on detach
If we have contention in drbd_al_begin_io (heavy random IO),
an administrative request to detach the disk may deadlock
for similar reasons as the recently fixed deadlock if detaching
because of IO-error.

The approach taken here is to either go through the intermediate
cleanup state D_FAILED, or first lock out application io,
don't just go directly to D_DISKLESS.

We need an additional state bit (WAS_IO_ERROR) to distinguish
the -> D_FAILED because of IO-error from other failures.

Sanitize D_ATTACHING -> D_FAILED to D_ATTACHING -> D_DISKLESS.
If only attaching, ldev may be missing still, but would be referenced
from within the after_state_ch for -> D_FAILED, potentially
dereferencing a NULL pointer.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-22 15:46:11 +02:00
Lars Ellenberg 3beec1d446 drbd: tag a few error messages with "assert failed"
If those messages ever get logged, clearly state that they are
actually failed ASSERTS, so our regression tests can pick them up
from the logs more easily.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-22 15:41:20 +02:00
Lars Ellenberg aaa8e2b34c drbd: consolidate explicit drbd_md_sync into drbd_create_new_uuid
Every code path changing the current UUID needs to get it on stable
storage anyways. Flush it to disk right there, remove the now obsolete
explicit drbd_md_sync statements in the other code paths.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-22 15:36:56 +02:00
Jens Axboe 005a1d15f5 xen-blkfront: disable barrier/flush write support
The driver doesn't handle empty flushes. Disable barrier/flush
write support until this is fixed up.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-22 10:58:33 +02:00
Linus Torvalds 94ebd235c4 Merge branch 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio_blk: remove BKL leftovers
  virtio: console: Disable lseek(2) for port file operations
  virtio: console: Send SIGIO in case of port unplug
  virtio: console: Send SIGIO on new data arrival on ports
  virtio: console: Send SIGIO to processes that request it for host events
  virtio: console: Reference counting portdev structs is not needed
  virtio: console: Add reference counting for port struct
  virtio: console: Use cdev_alloc() instead of cdev_init()
  virtio: console: Add a find_port_by_devt() function
  virtio: console: Add a list of portdevs that are active
  virtio: console: open: Use a common path for error handling
  virtio: console: remove_port() should return void
  virtio: console: Make write() return -ENODEV on hot-unplug
  virtio: console: Make read() return -ENODEV on hot-unplug
  virtio: console: Unblock poll on port hot-unplug
  virtio: console: Un-block reads on chardev close
  virtio: console: Check if portdev is valid in send_control_msg()
  virtio: console: Remove control vq data only if using multiport support
  virtio: console: Reset vdev before removing device
2010-10-21 12:40:33 -07:00
Linus Torvalds 2017bd1945 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (22 commits)
  ceph: do not carry i_lock for readdir from dcache
  fs/ceph/xattr.c: Use kmemdup
  rbd: passing wrong variable to bvec_kunmap_irq()
  rbd: null vs ERR_PTR
  ceph: fix num_pages_free accounting in pagelist
  ceph: add CEPH_MDS_OP_SETDIRLAYOUT and associated ioctl.
  ceph: don't crash when passed bad mount options
  ceph: fix debugfs warnings
  block: rbd: removing unnecessary test
  block: rbd: fixed may leaks
  ceph: switch from BKL to lock_flocks()
  ceph: preallocate flock state without locks held
  ceph: add pagelist_reserve, pagelist_truncate, pagelist_set_cursor
  ceph: use mapping->nrpages to determine if mapping is empty
  ceph: only invalidate on check_caps if we actually have pages
  ceph: do not hide .snap in root directory
  rbd: introduce rados block device (rbd), based on libceph
  ceph: factor out libceph from Ceph file system
  ceph-rbd: osdc support for osd call and rollback operations
  ceph: messenger and osdc changes for rbd
  ...
2010-10-21 12:38:28 -07:00
Christoph Hellwig fe5a50a10c virtio_blk: remove BKL leftovers
Remove the BKL usage added in "block: push down BKL into .locked_ioctl".
Virtio-blk doesn't use the BKL for anything, and doesn't implement any
ioctl command by itself, but only uses the generic scsi_cmd_ioctl
which is fine without the BKL.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-10-21 17:44:05 +10:30
Dan Carpenter 85b5aaa624 rbd: passing wrong variable to bvec_kunmap_irq()
We should be passing "buf" here instead of "bv".  This is tricky because
it's not the same as kmap() and kunmap().  GCC does warn about it if you
compile on i386 with CONFIG_HIGHMEM.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20 15:38:25 -07:00
Dan Carpenter b8d0638a98 rbd: null vs ERR_PTR
ceph_alloc_page_vector() returns ERR_PTR(-ENOMEM) on errors.
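
To illustrate why a NULL check misses this kind of failure, here is a userspace sketch. The helpers err_ptr(), ptr_err() and is_err() are stand-ins modelled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); alloc_page_vector() and its simulate_oom flag are hypothetical and exist only so the failure path can be exercised.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: error codes are encoded in the top 4095 addresses,
 * mirroring the kernel convention. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that reports failure as an encoded errno
 * instead of NULL. simulate_oom forces the failure path. */
static void **alloc_page_vector(size_t num_pages, int simulate_oom)
{
    void **pages = simulate_oom ? NULL : calloc(num_pages, sizeof(*pages));

    if (!pages)
        return err_ptr(-ENOMEM);
    return pages;
}
```

A caller that tests only `if (!pages)` would treat the encoded error as a valid pointer; the check has to be `is_err(pages)`, with `ptr_err(pages)` recovering the error code.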

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20 15:38:24 -07:00
Yehuda Sadeh f4cf3deef4 block: rbd: removing unnecessary test
rbd_get_segment() can't return a negative value, we don't need to check
the return output.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-10-20 15:38:20 -07:00
Vasiliy Kulikov 28f259b7cd block: rbd: fixed may leaks
rbd_client_create() doesn't free rbdc, this leads to many leaks.

seg_len in rbd_do_op() is unsigned, so (seg_len < 0) makes no sense.
Also if fixed check fails then seg_name is leaked.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-10-20 15:38:19 -07:00
Yehuda Sadeh 602adf4002 rbd: introduce rados block device (rbd), based on libceph
The rados block device (rbd), based on osdblk, creates a block device
that is backed by objects stored in the Ceph distributed object storage
cluster.  Each device consists of a single metadata object and data
striped over many data objects.

The rbd driver supports read-only snapshots.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20 15:38:13 -07:00
Mike Miller 6362beea89 cciss: fix PCI IDs for new Smart Array controllers
cciss: fix PCI IDs for new controllers

This patch fixes the botched up PCI IDs of new controllers. Please consider
this patch for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-19 09:40:34 +02:00
Jens Axboe fa251f8990 Merge branch 'v2.6.36-rc8' into for-2.6.37/barrier
Conflicts:
	block/blk-core.c
	drivers/block/loop.c
	mm/swapfile.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-19 09:13:04 +02:00
Michal Simek bda80da469 of/xsysace: Fix OF probing on little-endian systems
Convert big-endian DTB to little-endian if necessary.
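
Device tree cells are stored big-endian, so a little-endian CPU must convert them before use; in the kernel this is normally done with be32_to_cpu(). A portable userspace sketch of the same conversion follows, where be32_decode() is a hypothetical helper name and not part of the driver:

```c
#include <stdint.h>

/* Decode one 32-bit big-endian cell byte by byte.
 * The result is the same on little- and big-endian hosts. */
static uint32_t be32_decode(const uint8_t cell[4])
{
    return ((uint32_t)cell[0] << 24) |
           ((uint32_t)cell[1] << 16) |
           ((uint32_t)cell[2] << 8)  |
            (uint32_t)cell[3];
}
```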

Signed-off-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-10-18 09:50:09 -06:00
Noboru Iwamatsu b78c951256 xenbus: prevent warnings on unhandled enumeration values
XenbusStateReconfiguring/XenbusStateReconfigured were introduced by
c/s 437, but aren't handled in many switch statements.

.. also pulled from the linux-2.6-sparse-tree tree.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-10-18 10:49:36 -04:00
Arnd Bergmann 6038f373a3 llseek: automatically add .llseek fop
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.
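
The selection order applied by the semantic patch in this message can be summarized as a plain decision function. This is only a sketch: struct fops_facts, its field names, and the enum are hypothetical, and the real decision is made by coccinelle matching on the source code, not by runtime flags.

```c
#include <stdbool.h>

/* Facts about one file_operations instance (hypothetical field names). */
struct fops_facts {
    bool has_llseek;          /* .llseek is already set */
    bool open_is_nonseekable; /* .open is, or calls, nonseekable_open() */
    bool read_is_seq_read;    /* .read is seq_read */
    bool has_readdir;         /* .readdir is present */
    bool read_uses_pos;       /* .read accesses the file offset */
    bool write_uses_pos;      /* .write accesses the file offset */
};

enum llseek_choice {
    KEEP_EXISTING,      /* nothing is added */
    USE_NO_LLSEEK,
    USE_SEQ_LSEEK,
    USE_DEFAULT_LLSEEK,
    USE_NOOP_LLSEEK
};

/* Earlier rules take precedence over later ones. */
static enum llseek_choice pick_llseek(const struct fops_facts *f)
{
    if (f->has_llseek)
        return KEEP_EXISTING;
    if (f->open_is_nonseekable)
        return USE_NO_LLSEEK;
    if (f->read_is_seq_read)
        return USE_SEQ_LSEEK;
    if (f->has_readdir || f->read_uses_pos || f->write_uses_pos)
        return USE_DEFAULT_LLSEEK;
    return USE_NOOP_LLSEEK; /* offset is provably ignored, or no read/write */
}
```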

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// neither read nor write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15 15:53:27 +02:00
Lars Ellenberg 5dbfe7aedf drbd: add race-breaker to drbd_go_diskless
This adds a necessary race breaker to these commits:
    drbd: fix for possible deadlock on IO error during resync
    drbd: drop wrong debug asserts, fix recently introduced race

What we do is get a refcount, check the state, then depending on the
state and the requested minimum disk state, either hold it (success),
or give it back immediately (failed "try lock").

Some code paths (flushing of drbd metadata) may still grab and hold a
refcount even if we are D_FAILED (application IO won't).
So even if we hit local_cnt == 0 once after being D_FAILED,
we still need to wait for that again after we changed to D_DISKLESS.
Once local_cnt reaches 0 while we are D_DISKLESS, we can be sure that
no one will look at the protected members anymore, so only then is it
safe to free them.
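
The "get a refcount, check the state, hold or give back" pattern can be sketched in userspace with C11 atomics. The struct, the reduced set of disk states and their ordering, and the function names below are simplified assumptions for illustration and do not reproduce the driver's definitions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Reduced set of disk states, ordered from worst to best
 * (assumed ordering for this sketch). */
enum disk_state { D_DISKLESS, D_FAILED, D_INCONSISTENT, D_UP_TO_DATE };

struct dev {
    atomic_int local_cnt; /* references held on the backing device */
    atomic_int disk;      /* current enum disk_state */
};

static void put_ldev(struct dev *d)
{
    atomic_fetch_sub(&d->local_cnt, 1);
}

/* "Try lock": take the reference first, then check the state.
 * Keep the reference on success, give it back at once on failure. */
static bool get_ldev_if_state(struct dev *d, enum disk_state min)
{
    atomic_fetch_add(&d->local_cnt, 1);
    if (atomic_load(&d->disk) >= (int)min)
        return true;
    put_ldev(d);
    return false;
}

/* Protected members may be freed only once the device is D_DISKLESS
 * and the last reference has been dropped. */
static bool safe_to_free(struct dev *d)
{
    return atomic_load(&d->disk) == D_DISKLESS &&
           atomic_load(&d->local_cnt) == 0;
}
```

Because the counter is incremented before the state is read, a holder that was admitted while D_FAILED (such as a metadata flush asking for at least D_FAILED) still keeps the count above zero after the transition to D_DISKLESS, which is why the count has to be waited for again at that point.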

We cannot easily convert to standard locking primitives here, as we want
to be able to use it in atomic context (we always do a "try lock"),
as well as hold references for a "long time" (from IO submission to
completion callback).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-15 14:06:53 +02:00
Lars Ellenberg ac7241211d drbd: use dynamic_dev_dbg to optionally log uuid changes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-15 10:52:42 +02:00
Dan Carpenter 2265769531 drbd: cleanup: change "<= 0" to "== 0"
dt is unsigned so it's never less than zero.  We are calculating the
elapsed time, and that's never less than zero (unless there is a bug or
we invent time travel).  The comparison here is just to guard against
divide by zero bugs.
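
A small sketch of the guard in question. The helper name, its parameters, and the fallback value of 1 are assumptions for illustration, not the driver's code.

```c
/* Hypothetical helper: units of work done per elapsed time tick. */
static unsigned long rate_per_tick(unsigned long work,
                                   unsigned long now, unsigned long start)
{
    unsigned long dt = now - start; /* unsigned: can wrap, never negative */

    if (dt == 0) /* for an unsigned value this equals "dt <= 0" */
        dt = 1;  /* avoid dividing by zero */
    return work / dt;
}
```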

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2010-10-14 19:17:23 +02:00
Lars Ellenberg ca0e6098aa drbd: relax the grace period of the md_sync timer again
Consolidate the ifdefs for the debug level; they accidentally used both
DEBUG and DRBD_DEBUG_MD_SYNC.  Default to off.

For production, we can safely reduce the grace period for this timer
again to the value we used to have.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 19:15:38 +02:00
Lars Ellenberg 856c50c7b6 drbd: add some more explicit drbd_md_sync
It sometimes may take a while for the after state change work to be
scheduled, which does drbd_md_sync. At convenient places, we should do
explicit drbd_md_sync to have the new state information on disk as soon
as possible.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 19:08:58 +02:00
Lars Ellenberg 9d282875d8 drbd: drop wrong debug asserts, fix recently introduced race
commit 2372c38caadeaebc68a5ee190782c2a0df01edc3
 drbd: fix for possible deadlock on IO error during resync

introduced a new ASSERT, which turns out to be wrong. Drop it.

Also serialize the state change to D_DISKLESS with the after state
change work of the -> D_FAILED transition, don't open a new race.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 19:08:32 +02:00
Lars Ellenberg 0f8488e160 drbd: cleanup useless leftover warn/error printk's
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:53 +02:00
Lars Ellenberg 13d42685be drbd: add explicit drbd_md_sync to drbd_resync_finished
As we usually update the generation UUIDs here, we should explicitly
sync them to disk.  So far this has been done only implicitly by related
code paths.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:52 +02:00
Philipp Reisner b18b37befb drbd: Do not log an ASSERT for P_OV_REQUEST packets while C_CONNECTED
This might happen if the disk gets dropped on the VERIFY_S node.
Although this is a cluster-wide state transition, the VERIFY_T node
updates its connection state first. Then the ack packet for the
cluster-wide state transition travels back, and the VERIFY_S node
stops producing the P_OV_REQUEST packets.

There is absolutely nothing wrong with that.

Further, do not log "Can not satisfy peer's..." on the VERIFY_S
node in this case, but pretend that they had equal checksum.

[Bugz 327]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:51 +02:00
Lars Ellenberg e9e6f3ec53 drbd: fix for possible deadlock on IO error during resync
Scenario:

Something (say, flush-147:0) is in drbd_al_begin_io,
holding a local_cnt, waiting for the resync to make progress.

Disk fails, worker in after_state_ch does drbd_rs_cancel_all,
then waits for local_cnt to drop to zero.

flush-147:0 is woken by drbd_rs_cancel_all, needs to write an AL
transaction, and queues that on the worker.

Deadlock.

Fix: do not wait in the worker, have put_ldev() trigger the
state change D_FAILED -> D_DISKLESS when necessary.
put_ldev() cannot do the state change directly, as it may or may not
already hold various spinlocks. We queue a short work instead.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:50 +02:00
Lars Ellenberg 22cc37a943 drbd: fix unlikely access after free and list corruption
Various cleanup paths have been incomplete, for the very unlikely case
that we cannot allocate enough bios from process context when submitting
on behalf of the peer or resync process.

Never observed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:49 +02:00
Lars Ellenberg af85e8e83d drbd: fix for spurious fullsync (uuids rotated too fast)
If it was an "empty" resync, the SyncSource may have already "finished"
the resync and rotated the UUIDs, before noticing the connection loss
(and generating a new uuid, if Primary, rotating again), while the
SyncTarget did not change its uuids at all, or only got to the previous
sync-uuid.
This would then again lead to a full sync on next handshake
(see also Bug #251).

Fix:
Use explicit resync finished notification even for empty resyncs,
do not finish an empty resync implicitly on the SyncSource.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:48 +02:00
Lars Ellenberg e9ef7bb6f9 drbd: allow for explicit resync-finished notifications
Preparation patch so more drbd_send_state() usage on the peer
will not confuse drbd in receive_state().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:47 +02:00
Lars Ellenberg 4ac4aadacb drbd: preparation commit, using full state in receive_state()
no functional change, just using full state instead of just the .conn
part of it for comparisons.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:46 +02:00
Lars Ellenberg 2b2bf2148f drbd: drbd_send_ack_dp must not rely on header information
drbd commit 17c854fea474a5eb3cfa12e4fb019e46debbc4ec
drbd: receiving of big packets, for payloads between 64kByte and 4GByte
introduced a new on-the-wire packet header format.  We must no longer
assume either format, but use the result of whatever drbd_recv_header
has decoded.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:45 +02:00
Lars Ellenberg 004352fa60 drbd: Fix regression in recv_bm_rle_bits (compressed bitmap)
We used to be16_to_cpu the length field in our received packet header.
drbd commit 17c854fea474a5eb3cfa12e4fb019e46debbc4ec
    drbd: receiving of big packets, for payloads between 64kByte and 4GByte
changed this, but forgot to adjust a few places where we relied on
h->length being in native byte order.

This broke the receiving side of the RLE compressed bitmap exchange.
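The byte-order issue described above can be sketched in plain C. This is
only an illustration, not the drbd code: get_be16() and get_be32() are
hypothetical helpers that turn a big-endian wire field into a native
integer, which is the conversion the callers had been relying on.

```c
#include <stdint.h>

/* Hypothetical helper: read a 16-bit big-endian length from the wire. */
uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)(((unsigned int)p[0] << 8) | p[1]);
}

/* Hypothetical helper: read a 32-bit big-endian length from the wire. */
uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Code that reads the raw field without such a conversion sees a
byte-swapped length on little-endian hosts.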

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:44 +02:00
Philipp Reisner f10f262349 drbd: Fixed a stupid copy and paste error
This caused rs_planed to be not in sync with the content of the fifo.
That in turn could cause the resync to come to a complete halt.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:43 +02:00
Philipp Reisner 00b425377d drbd: Allow larger values for c-fill-target.
Connections through a compressing proxy might have more bits
on the fly. 500MByte instead of 50MByte

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:42 +02:00
Lars Ellenberg f65363cfa0 drbd: fix possible access after free
If we release the page pointed to by md_io_tmpp, we need to zero out the
pointer, too, as that may be used later to decide whether we need to
allocate a new page again.

Impact: a previously freed page may be used and clobbered.  Depending on
what that particular page is being used for meanwhile, this may result
in silent data corruption of completely unrelated things.

Only of concern on devices with logical_block_size != 512 byte,
if you re-attach after becoming diskless once.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:41 +02:00
Lars Ellenberg 8979d9c9e0 drbd: protocol compatibility for maximum packet sizes
Two missing corner cases to the "maximum packet size" handshake.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:41 +02:00
Philipp Reisner fb22c402ff drbd: Track the reasons to suspend IO in dedicated state bits
There are three ways to get IO suspended:

 * Loss of any access to data
 * Fence-peer-handler running
 * User requested to suspend IO

Track those in different bits, so that one condition clearing its
state bit does not interfere with the other two conditions.

Only when the user resumes IO does he overrule all three bits.

This is hidden from the user, who sees only a single suspend
bit.
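A minimal sketch of the idea, assuming one flag bit per reason. The names
SUSP_NO_DATA, SUSP_FENCING, SUSP_USER and struct io_state are made up for
this illustration and are not taken from the drbd source.

```c
#include <stdbool.h>

/* Hypothetical bit names, one per reason listed above. */
enum suspend_reason {
	SUSP_NO_DATA = 1u << 0,	/* loss of any access to data */
	SUSP_FENCING = 1u << 1,	/* fence-peer handler running */
	SUSP_USER    = 1u << 2	/* user requested to suspend IO */
};

struct io_state {
	unsigned int susp_bits;
};

void suspend_io(struct io_state *s, enum suspend_reason why)
{
	s->susp_bits |= (unsigned int)why;
}

/* A condition going away clears only its own bit. */
void clear_reason(struct io_state *s, enum suspend_reason why)
{
	s->susp_bits &= ~(unsigned int)why;
}

/* An explicit resume by the user overrules all three bits. */
void user_resume_io(struct io_state *s)
{
	s->susp_bits = 0;
}

/* What the user sees: a single "suspended" indication. */
bool io_is_suspended(const struct io_state *s)
{
	return s->susp_bits != 0;
}
```

With separate bits, the fence-peer handler finishing does not resume IO
while access to data is still lost.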

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:40 +02:00
Lars Ellenberg 78db89287c drbd: DIV_ROUND_UP not needed here
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:39 +02:00
Philipp Reisner 5a75cc7cfb drbd: Fixed compatibility with protocol versions smaller than 95
Forgot to consider the max size for the resync requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:38 +02:00
Lars Ellenberg f2906e183f drbd: fix for spurious full sync (becoming sync target looked like invalidate)
If a synctarget lost connection while being WFSyncUUID,
due to "state sanitizing", the attempted state change to SyncTarget
looked like an "invalidate" to after_state_ch() later,
thus caused a full sync on next handshake (Bug #318).

drbd0: PingAck did not arrive in time.
drbd0: peer( Primary -> Unknown ) conn( WFSyncUUID -> NetworkFailure ) pdsk( UpToDate -> DUnknown )

        from  : { cs:NetworkFailure ro:Secondary/Unknown ds:UpToDate/DUnknown r--- }
        to    : { cs:SyncTarget ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- }
        after sanitizing, resulted in
        state: { cs:NetworkFailure ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- }
        drbd0: disk( UpToDate -> Inconsistent )

Fix:
don't mask state transition errors in "sanitizing",
so the requested state change to SyncTarget fails,
instead of being implicitly "remapped" to invalidate.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:37 +02:00
Lars Ellenberg 02bc7174ae drbd: cosmetic, don't report resync for online-verify
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:36 +02:00
Lars Ellenberg a821cc4a9a drbd: fix spurious protocol error
If we cannot satisfy a request (because our disk just broke),
we still need to drain the payload.  Or we'll get a protocol error
when interpreting the payload as DRBD packet header.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:35 +02:00
Lars Ellenberg 1d53f09e17 drbd: fix potential kernel BUG (NULL deref)
BUG trace would look like:
 lc_find
 drbd_rs_complete_io
 got_OVResult
 drbd_asender

Could be triggered by explicit, or IO-error policy based,
detach during online-verify.

We may only dereference mdev->resync, if we first get_ldev(), as the
disk may break any time, causing mdev->resync to disappear once all
ldev references have been returned.
Already in flight online-verify requests or replies may still come in,
which we then need to ignore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:34 +02:00
Lars Ellenberg 435f07402b drbd: don't count sendpage()d pages only referenced by tcp as in use
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:33 +02:00
Philipp Reisner 76d2e7eca8 drbd: Adding support for BIO/Request flags: REQ_FUA, REQ_FLUSH and REQ_DISCARD
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:32 +02:00
Lars Ellenberg 1090c056c5 drbd: drbd_md_sync before calling user space helpers
Just in case we have some pending meta data changes to sync, do it
before we call our userland helper, as that may take some time,
or even cause a hard reboot.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:31 +02:00
Lars Ellenberg ee15b03816 drbd: fix race on meta-data update, addendum
addendum to baa33ae4eaa4477b60af7c434c0ddd1d182c1ae7

The race:
    drbd_md_sync()
	if (!test_and_clear_bit(MD_DIRTY, &mdev->flags))
		return;
    ==> RACE with drbd_md_mark_dirty() rearming the timer.
	del_timer(&mdev->md_sync_timer);

    Fixed by moving the del_timer before the test_and_clear_bit.

Additionally only rearm the timer in drbd_md_mark_dirty, if MD_DIRTY was
not already set, reduce the grace period from five to one second, and
add an ifdef'ed debugging aid to find code paths missing an explicit
drbd_md_sync, if any, as those are the only relevant ones for this race.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:30 +02:00
Philipp Reisner 63106d3c6c drbd: Removed a race that could cause unexpected execution of w_make_resync_request()
The actual race happened in the drbd_start_resync() function, where
drbd_resync_finished() -> __drbd_set_state() set STOP_SYNC_TIMER and
armed the timer.

If the timer fired before execution reaches the mod_timer statement
at the end of drbd_start_resync() the latter would cause an
unexpected call to w_make_resync_request().

Removed the STOP_SYNC_TIMER bit, and base it on the connection state.

The STOP_SYNC_TIMER bit probably originates from the time before
the state engine.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:29 +02:00
Lars Ellenberg ef50a3e34f drbd: implicitly create unconfigured devices on sync-after dependencies
If pacemaker (for example) decided to initialize minor devices not in
the exact sync-after dependency order, the configuration partially
failed with an error "The sync-after minor number is invalid". (Bugz. #322)

We can avoid that by implicitly creating unconfigured minor devices,
if others depend on them.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:28 +02:00
Lars Ellenberg 3f3a9b849d drbd: fix race on meta-data update
The race:
	drbd_md_mark_dirty()
	drbd_md_sync()
		if (!test_and_clear_bit(MD_DIRTY, &mdev->flags))
			return;
		drbd_md_sync_page_io(mdev, mdev->ldev, sector, WRITE)
  ==> RACE
		clear_bit(MD_DIRTY, &mdev->flags); <== spurious

Fixed by removing the spurious clear_bit.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:28 +02:00
Lars Ellenberg c518d04fde drbd: fix race between deconfiguring and reconfiguring network
If a drbd_nl_net_conf hits the small window between the state change
to C_STANDALONE and the corresponding cleanup in after_state_ch,
that cleanup would throw away stuff we now need again,
and later trigger BUG_ON()s.

Fixed by properly serializing the new config request with
any pending cleanup.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:27 +02:00
Philipp Reisner 0778286a13 drbd: Disable activity log updates when the whole device is out of sync
When the complete device is marked as out of sync, we can disable
updates of the on disk AL. Currently AL updates are only disabled
if one uses the "invalidate-remote" command on an unconnected,
primary device, or when at attach time all bits in the bitmap are
set.

As of now, AL updates do not get disabled when all bits become
set due to application writes to an unconnected DRBD device.
While this is a missing feature, it is not considered important,
and might get added later.

BTW, after initializing a "one legged" DRBD device
drbdadm create-md resX
drbdadm -- --force primary resX
AL updates also get disabled, until the first connect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:26 +02:00
Philipp Reisner d53733893d drbd: Actually allow BIOs up to 128k (was 32k).
Now that we have multiple BIOs per ee and packets with a 32 bit length
field, it is time to use these goodies.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:25 +02:00
Philipp Reisner 02918be227 drbd: receiving of big packets, for payloads between 64kByte and 4GByte
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:24 +02:00
Philipp Reisner 0b70a13dac drbd: Sending of big packets, for payloads from 64KByte to 4GByte
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:23 +02:00
Philipp Reisner 204bba9965 drbd: Bugfix for regression introduced with f9bc8913c06022e
If we intend to use the block_id member of an epoch entry,
we may not use the digest member.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:22 +02:00
Philipp Reisner 48acf86898 drbd: Microfix: Assigning sector once is sufficient
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:21 +02:00
Lars Ellenberg 0f0601f4ea drbd: new configuration parameter c-min-rate
We now track the data rate of locally submitted resync related requests,
and can thus detect non-resync activity on the lower level device.

If the current sync rate is above c-min-rate, and the lower level device
appears to be busy, we throttle the resyncer.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:20 +02:00
Lars Ellenberg 80a40e439e drbd: reduce code duplication when receiving data requests
also canonicalize the return values of read_for_csum
and drbd_rs_begin_io to return -ESOMETHING, or 0 for success.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:19 +02:00
Lars Ellenberg 1d7734a0df drbd: use rolling marks for resync speed calculation
The current resync speed as displayed in /proc/drbd fluctuates a lot.
Using an array of rolling marks makes this calculation much more stable.
We used to have this (a long time ago with 0.7), but it got lost somehow.

If "stalled", do not discard the rest of the information, just add a
" (stalled)" tag to the progress line.

This patch also shortens a spinlock critical section somewhat, and
reduces the number of atomic operations in put_ldev.
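The rolling-marks calculation can be sketched roughly as follows. This is
an illustration under assumptions, not the drbd implementation: NR_MARKS,
struct rate_est and the function names are hypothetical, and time is
counted in whole seconds.

```c
#define NR_MARKS 8	/* hypothetical number of marks kept */

struct mark {
	unsigned long when;	/* time the mark was taken, in seconds */
	unsigned long left;	/* amount still to be resynced at that time */
};

struct rate_est {
	struct mark marks[NR_MARKS];
	unsigned int last;	/* index of the most recent mark */
};

/* Start of resync: every slot holds the same initial mark. */
void rate_init(struct rate_est *r, unsigned long now, unsigned long left)
{
	unsigned int i;

	for (i = 0; i < NR_MARKS; i++) {
		r->marks[i].when = now;
		r->marks[i].left = left;
	}
	r->last = 0;
}

/* Called periodically: overwrite the oldest slot with a new mark. */
void rate_add_mark(struct rate_est *r, unsigned long now, unsigned long left)
{
	r->last = (r->last + 1) % NR_MARKS;
	r->marks[r->last].when = now;
	r->marks[r->last].left = left;
}

/* Speed is measured against the oldest mark still in the array. */
unsigned long rate_per_sec(const struct rate_est *r, unsigned long now,
			   unsigned long left)
{
	const struct mark *oldest = &r->marks[(r->last + 1) % NR_MARKS];
	unsigned long dt = now - oldest->when;

	if (dt == 0)
		dt = 1;		/* avoid division by zero */
	if (oldest->left < left)
		return 0;	/* no progress since that mark */
	return (oldest->left - left) / dt;
}
```

Averaging over the window spanned by the marks, rather than over the last
interval only, is what smooths out the displayed value.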

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:18 +02:00
Lars Ellenberg 0bb70bf601 drbd: remove outdated comment and dead code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:17 +02:00
Lars Ellenberg c36c3ced69 drbd: let drbd_free_ee implicitly free any digest
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:16 +02:00
Philipp Reisner 85719573dd drbd: Replaced some casts by a union. Improved comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:15 +02:00
Philipp Reisner d207450cf2 drbd: Bugfix: rs_in_flight could become wrong if read_for_csum() requested reschedule later
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:14 +02:00
Philipp Reisner 778f271dfe drbd: The new, smarter resync speed controller
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:14 +02:00
Philipp Reisner 8e26f9ccb9 drbd: New sync_param packet, that includes the parameters of the new controller
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:13 +02:00
Philipp Reisner 9a31d7164d drbd: New sync parameters for the smart resync rate controller
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:38:12 +02:00
Lars Ellenberg d28fd092a5 drbd: fix list corruption (recent regression)
The commit 288f422ec1
 drbd: Track all IO requests on the TL, not writes only
moved a list_add_tail(req, ) into a region where req
may have just been freed due to conflict detection.

Fix this by adding a proper cleanup section for that code path.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 18:31:43 +02:00
Philipp Reisner e756414f7d drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:12:07 +02:00
Philipp Reisner 6709893059 drbd: Make sure tl_restart(, resend) can not get called multiple times for a new connection
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:09:09 +02:00
Philipp Reisner f70b351159 drbd: Do not try to free tl_hash in drbd_disconnect() when IO is suspended
We may not free tl_hash when IO is suspended, since we can not wait
until ap_bio_cnt reaches zero.

We can do this after susp reached 0, since then tl_clear was called.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:08:27 +02:00
Philipp Reisner 8f488156c0 drbd: Allow attach while IO is suspended
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:05:32 +02:00
Philipp Reisner cfa03415a1 drbd: Allow tl_restart() to do IO completion while IO is suspended
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:05:08 +02:00
Philipp Reisner 84dfb9f564 drbd: Fixed a deadlock, probably only affected UP machines
After disconnect (most likely mdev->net_cnt == 0) and we are
still in an unstable state (!drbd_state_is_stable()). When we
get an IO request in drbd_get_max_buffers() (called from
__inc_ap_bio_cond(), called from inc_ap_bio()) we wake up
misc_wait. Misc_wait is also used in inc_ap_bio() to sleep
until the outcome of __inc_ap_bio_cond() changes. => Busy loop!

Solution: Have a dedicated wait queue for get_net_conf() and
put_net_conf().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:04:46 +02:00
Philipp Reisner 65d922c33e drbd: Do not do a hard state change when establishing a connection [bugz 304]
Make sure the state engine can deny two primaries to connect

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:04:10 +02:00
Philipp Reisner 481c6f5032 drbd: Ensure that the peer was not rebooted in the meantime before resending TL
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 15:01:37 +02:00
Philipp Reisner 43a5182ccc drbd: Delayed creation of current-UUID
When a fencing policy of "resource-and-stonith" is configured,
and DRBD loses connection to its peer, we can delay the
creation of a new current-UUID until IO gets thawed.

That allows one to deploy fence-peer handlers that actually
commit suicide on the machine they get started.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:59:21 +02:00
Philipp Reisner 87f7be4cf8 drbd: Run the fence-peer helper asynchronously
Since we can not thaw the transfer log, the next logical step is
to allow reconnects while the fence-peer handler runs.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:58:36 +02:00
Philipp Reisner 1616a25493 drbd: Reduce the verbosity of some state transitions
State transitions in the space of non-allowed states used
to be very noisy. Reduce that, since that has little value
for the majority of the user base.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:57:22 +02:00
Philipp Reisner 999122bc18 drbd: Removing a by now obsolete clause in the state sanitizing
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:56:50 +02:00
Philipp Reisner 18a50fa213 drbd: Now we need to handle the ed_uuid of a diskless, unconnected primary correctly
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:56:00 +02:00
Philipp Reisner 894c6a9461 drbd: Disabled the crashed_primary detection for re-attach of last data while IO is frozen
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:55:11 +02:00
Philipp Reisner 47ff2d0a8e drbd: Do not allow a fencing-policy of resource-and-stonith with protocol A
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:53:42 +02:00
Philipp Reisner 265be2d098 drbd: Finished the "on-no-data-accessible suspend-io;" functionality
When no data is accessible (no connection to the peer, nor a local disk)
allow the user to select to freeze all IO operations instead of getting
IO errors.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:52:53 +02:00
Philipp Reisner 905cd7d8ac drbd: Removed redundant error checks in the request code path
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:39:38 +02:00
Philipp Reisner 5ba82308ea drbd: factored drbd_req_make_private_bio() out of drbd_req_new()
Preparing tl_thaw_dio()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:37:33 +02:00
Philipp Reisner b9b98716f8 drbd: Do not send two barriers without any writes between them
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:36:51 +02:00
Philipp Reisner 11b58e73a3 drbd: factored tl_restart() out of tl_clear().
If IO was frozen for a temporary network outage, resend the
content of the transfer-log into the newly established connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:35:58 +02:00
Philipp Reisner 2a80699f80 drbd: mod_req has now a return value
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:26:45 +02:00
Philipp Reisner 288f422ec1 drbd: Track all IO requests on the TL, not writes only
With that the drbd_fail_pending_reads() function becomes obsolete.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:25:20 +02:00
Philipp Reisner 7e602c0aaf drbd: renamed drbd_tl_epoch.n_req to drbd_tl_epoch.n_writes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14 14:23:45 +02:00
Dan Carpenter 93055c3104 ps3disk: passing wrong variable to bvec_kunmap_irq()
This should pass "buf" to bvec_kunmap_irq() instead of "bv".  The api is
like kmap_atomic() instead of kmap().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-12 18:56:33 +02:00
Mike Snitzer e4c4776dea virtio-blk: fix request leak.
Must drop reference taken by blk_make_request().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org # .35.x
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-09 11:42:37 -07:00
Arnd Bergmann 2a48fc0ab2 block: autoconvert trivial BKL users to private mutex
The block device drivers have all gained new lock_kernel
calls from a recent pushdown, and some of the drivers
were already using the BKL before.

This turns the BKL into a set of per-driver mutexes.
Still need to check whether this is safe to do.

file=$1
name=$2
if grep -q lock_kernel ${file} ; then
    if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
    else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
    fi
    sed -i ${file} \
        -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                     /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

} }"  \
    -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
    -e '/[      ]*cycle_kernel_lock();/d'
else
    sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                -e '/cycle_kernel_lock()/d'
fi

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-05 15:01:10 +02:00
Arnd Bergmann 613655fa39 drivers: autoconvert trivial BKL users to private mutex
All these files use the big kernel lock in a trivial
way to serialize their private file operations,
typically resulting from an earlier semi-automatic
pushdown from VFS.

None of these drivers appears to want to lock against
other code, and they all use the BKL as the top-level
lock in their file operations, meaning that there
is no lock-order inversion problem.

Consequently, we can remove the BKL completely,
replacing it with a per-file mutex in every case.
Using a scripted approach means we can avoid
typos.

These drivers do not seem to be under active
maintenance from my brief investigation. Apologies
to those maintainers that I have missed.

file=$1
name=$2
if grep -q lock_kernel ${file} ; then
    if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
    else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
    fi
    sed -i ${file} \
        -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                     /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

} }"  \
    -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
    -e '/[      ]*cycle_kernel_lock();/d'
else
    sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                -e '/cycle_kernel_lock()/d'
fi

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-05 15:01:04 +02:00
Dan Rosenberg 252a52aa4f Fix pktcdvd ioctl dev_minor range check
The PKT_CTRL_CMD_STATUS device ioctl retrieves a pointer to a
pktcdvd_device from the global pkt_devs array.  The index into this
array is provided directly by the user and is a signed integer, so the
comparison to ensure that it falls within the bounds of this array will
fail when provided with a negative index.

This can be used to read arbitrary kernel memory or cause a crash due to
an invalid pointer dereference.  This can be exploited by users with
permission to open /dev/pktcdvd/control (on many distributions, this is
readable by group "cdrom").
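The failure mode can be reproduced in a few lines of C. This is a sketch
only: MAX_DEVS, devs[] and both functions are hypothetical stand-ins, not
the pktcdvd code.

```c
#include <stddef.h>

#define MAX_DEVS 8	/* hypothetical table size */

static int devs[MAX_DEVS];

/* Upper-bound check on a signed index: -1 is "in range" here. */
int signed_check_passes(int idx)
{
	return idx < MAX_DEVS;
}

/*
 * Taking the index as unsigned makes a negative input wrap to a huge
 * value, so the same comparison rejects it.
 */
int *find_dev(unsigned int idx)
{
	if (idx >= MAX_DEVS)
		return NULL;
	return &devs[idx];
}
```

Changing the parameter type instead of adding a cast at the comparison
keeps the check correct for every caller.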

Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
[ Rather than add a cast, just make the function take the right type -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-27 16:29:06 -07:00
Vivek Goyal 504c6d1b44 amiga floppy: Compile failure fixes
o Compile fixes for amiga floppy driver.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-26 12:23:25 +09:00
Vivek Goyal 639e2f2aa7 atari floppy: Stop sharing request queue across multiple gendisks
o Use one request queue per gendisk instead of sharing the queue.

o Don't have hardware. No compile testing or run time testing done. Completely
  untested.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-24 20:35:45 +02:00
Vivek Goyal 786029ff81 amiga floppy: Stop sharing request queue across multiple gendisks
o Use one request queue per gendisk instead of sharing request queue

o Don't have hardware. No compile testing or run time testing done. Completely
  untested.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-24 20:35:44 +02:00
Jens Axboe 488211844e floppy: switch to one queue per drive instead of sharing a queue
Pretty straightforward conversion. Note that we do round-robin
between the drives that have available requests, whereas before we simply
used the drive that the IO scheduler told us to. Since the IO
scheduler doesn't care about multiple devices per queue, the resulting
sort would not have made sense.

Fixed by Vivek to get rid of a double lock problem in set_next_request()
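A round-robin pick over per-drive queues could look like the sketch below.
It is an illustration only: next_drive_rr(), the pending[] counters and
the cursor are hypothetical and do not mirror the floppy driver's
set_next_request().

```c
/*
 * Return the index of the next drive with pending requests, searching
 * from the drive after *cursor and wrapping around, or -1 if every
 * queue is empty.  On success *cursor is advanced to the drive picked.
 */
int next_drive_rr(const int *pending, int n, int *cursor)
{
	int i;

	for (i = 1; i <= n; i++) {
		int d = (*cursor + i) % n;

		if (pending[d] > 0) {
			*cursor = d;
			return d;
		}
	}
	return -1;
}
```

Starting the search one past the last drive served keeps a single busy
drive from starving the others.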

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
2010-09-22 09:32:36 +02:00
Dan Carpenter b0722cb1ac cciss: freeing uninitialized data on error path
The "h->scatter_list" is allocated inside a for loop.  If any of those
allocations fail, then the rest of the list is uninitialized data.  When
we free it we should start from the top and free backwards so that we
don't call kfree() on uninitialized pointers.

Also if the allocation for "h->scatter_list" fails then we would get an
Oops here.  I should have noticed this when I sent: 4ee69851c "cciss:
handle allocation failure."  but I didn't.  Sorry about that.
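The backwards unwind can be sketched in userspace C. This is not the cciss
code: alloc_all() is a hypothetical function, and its fail_at parameter
exists only to simulate an allocation failure at a chosen index.

```c
#include <stdlib.h>

/*
 * Fill slots[0..n-1] with allocations.  If one fails, free only the
 * entries that were actually set, walking backwards from the failure
 * point, and leave the untouched tail alone.
 * Returns 0 on success, -1 on failure.
 */
int alloc_all(void **slots, int n, size_t size, int fail_at)
{
	int i;

	for (i = 0; i < n; i++) {
		/* fail_at simulates malloc() returning NULL at that index */
		slots[i] = (i == fail_at) ? NULL : malloc(size);
		if (!slots[i])
			goto unwind;
	}
	return 0;

unwind:
	while (--i >= 0) {
		free(slots[i]);
		slots[i] = NULL;
	}
	return -1;
}
```

Entries past the failure index are never passed to free(), so it does not
matter that they still hold uninitialized values.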

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-21 11:49:17 +02:00
Christoph Hellwig dd3932eddf block: remove BLKDEV_IFL_WAIT
All the blkdev_issue_* helpers can only sanely be used for synchronous
callers.  To issue cache flushes or barriers asynchronously the caller needs
to set up a bio by itself with a completion callback to move the asynchronous
state machine ahead.  So drop the BLKDEV_IFL_WAIT flag that is always
specified when calling blkdev_issue_* and also remove the now unused flags
argument to blkdev_issue_flush and blkdev_issue_zeroout.  For
blkdev_issue_discard we need to keep it for the secure discard flag, which
gains a more descriptive name and loses the bitops vs flag confusion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-16 20:52:58 +02:00
Martin K. Petersen c8bf133682 Consolidate min_not_zero
We have several users of min_not_zero, each of them using their own
definition.  Move the define to kernel.h.
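
For readers unfamiliar with the helper, here is a rough sketch of what min_not_zero computes. The kernel version is a type-generic macro; this fixed-type function, with the hypothetical name demo_min_not_zero, only illustrates the semantics: a zero argument means "no limit set" and is ignored.

```c
/* Smaller of two limits, where 0 means "unset" rather than "smallest". */
static inline unsigned int demo_min_not_zero(unsigned int x, unsigned int y)
{
	if (x == 0)
		return y;
	if (y == 0)
		return x;
	return x < y ? x : y;
}
```

This is useful when stacking queue limits, where one device may simply not advertise a limit.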

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2010-09-10 20:07:38 +02:00
Linus Torvalds ff3cb3fec3 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Range check cpu in blk_cpu_to_group
  scatterlist: prevent invalid free when alloc fails
  writeback: Fix lost wake-up shutting down writeback thread
  writeback: do not lose wakeup events when forking bdi threads
  cciss: fix reporting of max queue depth since init
  block: switch s390 tape_block and mg_disk to elevator_change()
  block: add function call to switch the IO scheduler from a driver
  fs/bio-integrity.c: return -ENOMEM on kmalloc failure
  bio-integrity.c: remove dependency on __GFP_NOFAIL
  BLOCK: fix bio.bi_rw handling
  block: put dev->kobj in blk_register_queue fail path
  cciss: handle allocation failure
  cfq-iosched: Documentation help for new tunables
  cfq-iosched: blktrace print per slice sector stats
  cfq-iosched: Implement tunable group_idle
  cfq-iosched: Do group share accounting in IOPS when slice_idle=0
  cfq-iosched: Do not idle if slice_idle=0
  cciss: disable doorbell reset on reset_devices
  blkio: Fix return code for mkdir calls
2010-09-10 07:26:27 -07:00
Tejun Heo 02c42b7a68 virtio_blk: drop REQ_HARDBARRIER support
Remove now unused REQ_HARDBARRIER support.  virtio_blk already
supports REQ_FLUSH and the usefulness of REQ_FUA for virtio_blk is
questionable at this point, so there's nothing else to do to support
new REQ_FLUSH/FUA interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00
Tejun Heo 6259f28459 block/loop: implement REQ_FLUSH/FUA support
Deprecate REQ_HARDBARRIER and implement REQ_FLUSH/FUA instead.  Also,
instead of checking file->f_op->fsync() directly, look at the return
value of vfs_fsync() and ignore an -EINVAL return.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00
Tejun Heo 9cbbdca44a block: remove spurious uses of REQ_HARDBARRIER
REQ_HARDBARRIER is deprecated.  Remove spurious uses in the following
users.  Please note that other than osdblk, all other uses were
already spurious before deprecation.

* osdblk: osdblk_rq_fn() won't receive any request with
  REQ_HARDBARRIER set.  Remove the test for it.

* pktcdvd: use of REQ_HARDBARRIER in pkt_generic_packet() doesn't mean
  anything.  Removed.

* aic7xxx_old: Setting MSG_ORDERED_Q_TAG on REQ_HARDBARRIER is
  spurious.  Removed.

* sas_scsi_host: Setting TASK_ATTR_ORDERED on REQ_HARDBARRIER is
  spurious.  Removed.

* scsi_tcq: The ordered tag path wasn't being used anyway.  Removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: James Bottomley <James.Bottomley@suse.de>
Cc: Peter Osterlund <petero2@telia.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Tejun Heo 4913efe456 block: deprecate barrier and replace blk_queue_ordered() with blk_queue_flush()
Barrier is deemed too heavy and will soon be replaced by FLUSH/FUA
requests.  Deprecate barrier.  All REQ_HARDBARRIERs are failed with
-EOPNOTSUPP and blk_queue_ordered() is replaced with simpler
blk_queue_flush().

blk_queue_flush() takes combinations of REQ_FLUSH and FUA.  If a
device has write cache and can flush it, it should set REQ_FLUSH.  If
the device can handle FUA writes, it should also set REQ_FUA.

All blk_queue_ordered() users are converted.

* ORDERED_DRAIN is mapped to 0 which is the default value.
* ORDERED_DRAIN_FLUSH is mapped to REQ_FLUSH.
* ORDERED_DRAIN_FLUSH_FUA is mapped to REQ_FLUSH | REQ_FUA.
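
The mapping listed above can be written down as a small lookup. The sketch below is self-contained and uses hypothetical DEMO_* constants with made-up values; the real QUEUE_ORDERED_* and REQ_* definitions live in the kernel headers and are not reproduced here.

```c
/* Illustrative stand-ins for the old ordered modes. */
enum demo_ordered {
	DEMO_ORDERED_DRAIN,
	DEMO_ORDERED_DRAIN_FLUSH,
	DEMO_ORDERED_DRAIN_FLUSH_FUA,
};

/* Illustrative stand-ins for the new flush capability bits. */
#define DEMO_REQ_FLUSH	(1u << 0)
#define DEMO_REQ_FUA	(1u << 1)

/* Translate an old ordered mode into the flags a driver would now advertise. */
unsigned int demo_ordered_to_flush(enum demo_ordered mode)
{
	switch (mode) {
	case DEMO_ORDERED_DRAIN_FLUSH:
		return DEMO_REQ_FLUSH;
	case DEMO_ORDERED_DRAIN_FLUSH_FUA:
		return DEMO_REQ_FLUSH | DEMO_REQ_FUA;
	case DEMO_ORDERED_DRAIN:
	default:
		return 0;	/* the default: no write cache to flush */
	}
}
```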

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Tejun Heo 6958f14545 block: kill QUEUE_ORDERED_BY_TAG
Nobody is making meaningful use of ORDERED_BY_TAG now and queue
draining for barrier requests will be removed soon which will render
the advantage of tag ordering moot.  Kill ORDERED_BY_TAG.  The
following users are affected.

* brd: converted to ORDERED_DRAIN.
* virtio_blk: ORDERED_TAG path was already marked deprecated.  Removed.
* xen-blkfront: ORDERED_TAG case dropped.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Tejun Heo 589d7ed02a block/loop: queue ordered mode should be DRAIN_FLUSH
loop implements FLUSH using fsync but was incorrectly setting its
ordered mode to DRAIN.  Change it to DRAIN_FLUSH.  In practice, this
doesn't change anything as loop doesn't make use of the block layer
ordered implementation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:36 +02:00
Stephen M. Cameron fcfb5c0ce1 cciss: remove some superfluous tests from cciss_bigpassthru()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:40 +02:00
Stephen M. Cameron 0c9f5ba7cb cciss: factor out cciss_big_passthru
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:39 +02:00
Stephen M. Cameron f32f125b1c cciss: factor out cciss_passthru
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:37 +02:00
Stephen M. Cameron 0894b32c5c cciss: factor out cciss_getluninfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:36 +02:00
Stephen M. Cameron c525919ddf cciss: factor out cciss_getdrivver
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:35 +02:00
Stephen M. Cameron 8a4f7fbfdd cciss: factor out cciss_getfirmver
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:34 +02:00
Stephen M. Cameron d18dfad4e2 cciss: factor out cciss_getbustypes
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:33 +02:00
Stephen M. Cameron 93c7493113 cciss: factor out cciss_getheartbeat
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:32 +02:00
Stephen M. Cameron 4f43f32cd3 cciss: factor out cciss_setnodename
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:32 +02:00
Stephen M. Cameron 2521610942 cciss: factor out cciss_getnodename
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:31 +02:00
Stephen M. Cameron 4c800eed9a cciss: factor out cciss_setintinfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:30 +02:00
Stephen M. Cameron 576e661c65 cciss: factor out cciss_getintinfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:29 +02:00
Stephen M. Cameron 0a25a5aee7 cciss: factor out cciss_getpciinfo
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:12:28 +02:00
Stephen M. Cameron 2a643ec67f cciss: fix reporting of max queue depth since init
The ioctl path and the scsi tape path were not accounting
for their additions to the queue depth.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-25 19:58:53 +02:00
Linus Torvalds c05e1e23b8 Merge branch 'for-upstream/pvhvm' of git://xenbits.xensource.com/people/ianc/linux-2.6
* 'for-upstream/pvhvm' of git://xenbits.xensource.com/people/ianc/linux-2.6:
  xen: pvhvm: make it clearer that XEN_UNPLUG_* define bits in a bitfield
  xen: pvhvm: rename xen_emul_unplug=ignore to =unnecessary
  xen: pvhvm: allow user to request no emulated device unplug
2010-08-23 18:29:18 -07:00
Milan Broz ee86273062 loop: add some basic read-only sysfs attributes
Create /sys/block/loopX/loop directory and provide these attributes:
 - backing_file
 - autoclear
 - offset
 - sizelimit

This loop directory is present only if the loop device is configured.

To be used in util-linux-ng (and possibly elsewhere, like udev rules)
where code needs to get loop attributes from the kernel (and not store
duplicate info in userspace).

Moreover, loop ioctls are not even able to provide full backing
file info because of buffer limits.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 15:18:10 +02:00
Jens Axboe 52cc2eef31 block: switch s390 tape_block and mg_disk to elevator_change()
Now that we have this API, switch the two in-kernel users to it.
Resolves an oops introduced by commit
1abec4fdbb.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 14:02:44 +02:00
Ian Campbell 1dc7ce99b0 xen: pvhvm: rename xen_emul_unplug=ignore to =unnecessary
It is not immediately clear what this option causes to become
ignored. The actual meaning is that it is not necessary to unplug the
emulated devices to safely use the PV ones, even if the platform does
not support the unplug protocol. (Presumably the user will only add
this option if they have ensured that their domain configuration is
safe.)

I think xen_emul_unplug=unnecessary better captures this.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
2010-08-23 11:59:29 +01:00
Jiri Slaby 5e00d1b5b4 BLOCK: fix bio.bi_rw handling
The bi_rw tests no longer return bool after commit 74450be1, but their
results are still stored in bools. The result doesn't fit there with
some compilers (gcc 4.5 here), so either use the !! idiom to get real
bools or use ulong where the result is assigned somewhere.
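
A small sketch of why the !! idiom matters when a masked flag is stored in a narrower variable. It assumes a hypothetical flag DEMO_FLAG at bit 8 and an unsigned char result; the commit does not say which type or flag was involved, so this only demonstrates the general hazard. (With a C99 _Bool the conversion itself already normalizes to 0 or 1.)

```c
#define DEMO_FLAG (1UL << 8)	/* hypothetical flag above the low byte */

/* Hazard: the masked value 0x100 is truncated to 0 by the narrowing store. */
unsigned char demo_test_truncating(unsigned long rw)
{
	return (unsigned char)(rw & DEMO_FLAG);
}

/* Fix: !! collapses any nonzero value to exactly 1 before it is stored. */
unsigned char demo_test_normalized(unsigned long rw)
{
	return (unsigned char)!!(rw & DEMO_FLAG);
}
```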

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 12:33:10 +02:00
Dan Carpenter 4ee69851cd cciss: handle allocation failure
If kmalloc() fails then cleanup and return failure (-1).

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 12:28:15 +02:00
Stephen M. Cameron 75230ff275 cciss: disable doorbell reset on reset_devices
The doorbell reset initially appears to work correctly,
the controller resets, comes up, some i/o can even be
done, but on at least some Smart Arrays in some servers,
it eventually causes a subsequent controller lockup due
to some kind of PCIe error, and kdump can end up leaving
the root filesystem in an unbootable state.  For this
reason, until the problem is fixed, or at least isolated
to certain hardware enough to be avoided, the doorbell
reset should not be used at all.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 11:02:17 +02:00
Graeme Smecher 7a50d06e24 of: fix missing headers for of_address_to_resource() in MTD and SysACE drivers
The drivers for Xilinx' SystemACE and physically mapped MTDs were missing
prototypes for of_address_to_resource(). This patch adds the necessary
headers.

Signed-off-by: Graeme Smecher <graeme.smecher@mail.mcgill.ca>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-17 13:16:47 -06:00
Linus Torvalds 58d4ea65b9 Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
  mmc_spi: Fix unterminated of_match_table
  of/sparc: fix build regression from of_device changes
  of/device: Replace struct of_device with struct platform_device
2010-08-12 09:11:31 -07:00
Linus Torvalds 2f9e825d3e Merge branch 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block: (149 commits)
  block: make sure that REQ_* types are seen even with CONFIG_BLOCK=n
  xen-blkfront: fix missing out label
  blkdev: fix blkdev_issue_zeroout return value
  block: update request stacking methods to support discards
  block: fix missing export of blk_types.h
  writeback: fix bad _bh spinlock nesting
  drbd: revert "delay probes", feature is being re-implemented differently
  drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
  drbd: Disable delay probes for the upcoming release
  writeback: cleanup bdi_register
  writeback: add new tracepoints
  writeback: remove unnecessary init_timer call
  writeback: optimize periodic bdi thread wakeups
  writeback: prevent unnecessary bdi threads wakeups
  writeback: move bdi threads exiting logic to the forker thread
  writeback: restructure bdi forker loop a little
  writeback: move last_active to bdi
  writeback: do not remove bdi from bdi_list
  writeback: simplify bdi code a little
  writeback: do not lose wake-ups in bdi threads
  ...

Fixed up pretty trivial conflicts in drivers/block/virtio_blk.c and
drivers/scsi/scsi_error.c as per Jens.
2010-08-10 15:22:42 -07:00
Jens Axboe a4cc14ec9f xen-blkfront: fix missing out label
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-08 21:50:05 -04:00
Lars Ellenberg e7f52dfb4f drbd: revert "delay probes", feature is being re-implemented differently
This was a now-abandoned attempt to throttle resync bandwidth
based on the delay it causes on the bulk data socket.
It has no userbase yet, and has been disabled by
9173465ccb51c09cc3102a10af93e9f469a0af6f already.
This removes the now unused code.

The basic feature, namely using up "idle" bandwidth
of the network and disk IO subsystems, with minimal impact
on application IO, is being reimplemented differently.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:57 +02:00
Philipp Reisner 85f4cc17a6 drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:57 +02:00
Philipp Reisner 6710a57603 drbd: Disable delay probes for the upcoming release
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:57 +02:00
Kulikov Vasiliy f6c4c8e19a cpqarray: check put_user() result
put_user() may fail, if so return -EFAULT.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:53:03 +02:00
Jeremy Fitzhardinge 7901d14144 xen/blkfront: Use QUEUE_ORDERED_DRAIN for old backends
If there's no feature-barrier key in xenstore, then it means it's a fairly
old backend which does uncached in-order writes, which means ORDERED_DRAIN
is appropriate.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:52:53 +02:00
Jeremy Fitzhardinge 4dab46ff26 xen/blkfront: use tagged queuing for barriers
When barriers are supported, then use QUEUE_ORDERED_TAG to tell the block
subsystem that it doesn't need to do anything else with the barriers.
Previously we used ORDERED_DRAIN which caused the block subsystem to
drain all pending IO before submitting the barrier, which would be
very expensive.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:52:53 +02:00
Stephen Hemminger 3b06c21e84 floppy: make controller const
The struct cont_t is just a set of virtual function pointers.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:31 +02:00
Julia Lawall ad96a7a7ea drivers/block: use memdup_user
Use memdup_user when user data is immediately copied into the
allocated region.  Some checkpatch cleanups in nearby code.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression from,to,size,flag;
position p;
identifier l1,l2;
@@

-  to = \(kmalloc@p\|kzalloc@p\)(size,flag);
+  to = memdup_user(from,size);
   if (
-      to==NULL
+      IS_ERR(to)
                 || ...) {
   <+... when != goto l1;
-  -ENOMEM
+  PTR_ERR(to)
   ...+>
   }
-  if (copy_from_user(to, from, size) != 0) {
-    <+... when != goto l2;
-    -EFAULT
-    ...+>
-  }
// </smpl>
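
To show the shape of the code after such a conversion, here is a userspace sketch. It is not kernel code: demo_memdup() stands in for memdup_user(), a NULL source stands in for a faulting user copy, and the demo_err_ptr()/demo_is_err()/demo_ptr_err() helpers are hypothetical imitations of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno in a pointer, and decode it again. */
static void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Allocate and copy in one step; errors come back encoded in the pointer. */
void *demo_memdup(const void *src, size_t len)
{
	void *p;

	if (!src)
		return demo_err_ptr(-EFAULT);	/* stands in for a bad user pointer */
	p = malloc(len);
	if (!p)
		return demo_err_ptr(-ENOMEM);
	memcpy(p, src, len);
	return p;
}

/* Caller after the conversion: one call and one error check. */
long demo_ioctl_copy(const void *arg, size_t len, void **out)
{
	void *buf = demo_memdup(arg, len);

	if (demo_is_err(buf))
		return demo_ptr_err(buf);
	*out = buf;
	return 0;
}
```

The caller no longer needs separate -ENOMEM and -EFAULT branches, which is the simplification the semantic patch automates.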

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Chirag Kantharia <chirag.kantharia@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:31 +02:00
Stephen M. Cameron 8112586063 cciss: cleanup interrupt_not_for_us
In the case of MSI/MSIX interrupts, we don't need to check
if the interrupt is for us, and in the case of the intx interrupt
handler, when checking if the interrupt is for us, we don't need
to check if we're using MSI/MSIX, we know we're not.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron b2a4a43dba cciss: change printks to dev_warn, etc.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron 6b4d96b878 cciss: separate cmd_alloc() and cmd_special_alloc()
cmd_alloc() took a parameter which caused it to either allocate
from a pre-allocated pool, or allocate using pci_alloc_consistent.
This parameter is always known at compile time, so this would
be better handled by breaking the function into two functions
and differentiating the cases by function names.  Same goes
for cmd_free().

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron f70dba8366 cciss: use consistent variable names
Use "h" for the hba structure and "c" for the command structures,
and get rid of the trivial CCISS_LOCK macro.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron 058a0f9f31 cciss: forbid hard reset of 640x boards
The 6402/6404 are two PCI devices -- two Smart Array controllers
-- that fit into one slot.  It is possible to reset them independently,
however, they share a battery backed cache module.  One of the pair
controls the cache and the 2nd one accesses the cache through the first
one.  If you reset the one controlling the cache, the other one will
not be a happy camper.  So we just forbid resetting this conjoined
mess.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron adfbc1ff34 cciss: sanitize max commands
Some controllers might try to tell us they support 0 commands
in performant mode.  This is a lie told by buggy firmware.
We have to be wary of this lest we try to allocate a negative
number of command blocks, which will be treated as unsigned,
and get an out of memory condition.
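
The hazard described above can be reproduced in a few lines of plain C. This is a sketch under assumptions: the constants DEMO_RESERVED_CMDS and DEMO_FALLBACK_CMDS and both function names are hypothetical, and the actual threshold and fallback used by the driver are not stated in this message.

```c
#include <stddef.h>

#define DEMO_RESERVED_CMDS 4	/* hypothetical commands held back by the driver */
#define DEMO_FALLBACK_CMDS 16	/* hypothetical safe default */

/* Hazard: a negative count converted to size_t becomes an enormous request. */
size_t demo_unsanitized_count(int fw_max_cmds)
{
	int nr_cmds = fw_max_cmds - DEMO_RESERVED_CMDS;

	return (size_t)nr_cmds;
}

/* Fix: distrust implausibly small values reported by the firmware. */
int demo_sanitize_max_cmds(int fw_max_cmds)
{
	if (fw_max_cmds < DEMO_FALLBACK_CMDS)
		return DEMO_FALLBACK_CMDS;
	return fw_max_cmds;
}
```

Clamping before any arithmetic is done on the value keeps the later allocation size positive.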

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron a6528d0172 cciss: fix hard reset code.
Smart Array controllers newer than the P600 do not honor the
PCI power state method of resetting the controllers.  Instead,
in these cases we can get them to reset via the "doorbell" register.

This escaped notice until we began using "performant" mode because
the fact that the controllers did not reset did not normally
impede subsequent operation, and so things generally appeared to
"work".  Once the performant mode code was added, if the controller
does not reset, it remains in performant mode.  The code immediately
after the reset presumes the controller is in "simple" mode
(previously, it had remained in simple mode the whole time).
If the controller remains in performant mode any code which presumes
it is in simple mode will not work.  So the reset needs to be fixed.

Unfortunately there are some controllers which cannot be reset by
either method (e.g. the P800).  We detect these cases by noticing that
the controller seems to remain in performant mode even after a
reset has been attempted.  In those cases we ignore the controller,
as any commands outstanding on it will result in stale completions.
To sum up, we try to do a better job of resetting the controller if
"reset_devices" is set, and if it doesn't work, we ignore that
controller.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:30 +02:00
Stephen M. Cameron 83123cb11b cciss: factor out cciss_reset_devices()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:12 +02:00
Stephen M. Cameron 8e93bf6d6c cciss: factor out cciss_find_cfg_addrs.
Rationale for this is that I will also need to use this code
in fixing kdump host reset code prior to having the hba structure.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:12 +02:00
Stephen M. Cameron b993313540 cciss: factor out cciss_enter_performant_mode
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:12 +02:00
Stephen M. Cameron 0f8a6a1e7b cciss: factor out cciss_wait_for_mode_change_ack()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron fe3b7527db cciss: make cciss_put_controller_into_performant_mode as __devinit
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron ff5f58f06d cciss: cleanup some debug ifdefs
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron bfd63ee571 cciss: factor out cciss_p600_dma_prefetch_quirk()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron 322e304c4d cciss: factor out cciss_enable_scsi_prefetch()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron 501b92cd6b cciss: factor out CISS_signature_present()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron afadbf4b95 cciss: factor out cciss_find_board_params
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron da5503217d cciss: fix leak of ioremapped memory
Fix a leak of ioremapped memory in the cciss_pci_init error path.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron 4809d0988f cciss: factor out cciss_find_cfgtables
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron e99ba13627 cciss: factor out cciss_wait_for_board_ready()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:11 +02:00
Stephen M. Cameron d474830da6 cciss: factor out cciss_find_memory_BAR()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron dac5488a9e cciss: remove board_id parameter from cciss_interrupt_mode()
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron dd9c426e92 cciss: factor out cciss_board_disabled
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron 6539fa9b2e cciss: factor out cciss_lookup_board_id
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron 292e50dd39 cciss: save pdev pointer in per hba structure early to avoid passing it around so much.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Stephen M. Cameron 373b45f7b6 cciss: Set the performant mode bit in the scsi half of the driver
In a couple of places, the performant mode bit wasn't being set in
the scsi half of the driver, causing commands to seem to hang.  Use
enqueue_cmd_and_start_io() where appropriate.  This fixes a bug where

	echo engage scsi > /proc/driver/cciss/cciss0

would hang.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:52:10 +02:00
Daniel Stodden d54142c71f blkfront: Klog the unclean release path
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:51:21 +02:00
Daniel Stodden 7b32d1044a blkfront: Remove obsolete info->users
This is just bd_openers, protected by the bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:49:20 +02:00
Daniel Stodden acfca3c622 blkfront: Remove obsolete info->users
This is just bd_openers, protected by the bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:47:26 +02:00
Daniel Stodden fa1bd3591a blkfront: Lock blockfront_info during xbdev removal
Same approach as blkfront_closing:
 * Grab the bdev safely, holding the info mutex.
 * Zap xbdev safely, holding the info mutex.
 * Try bdev removal safely, holding bd_mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:45:27 +02:00
Daniel Stodden 7fd152f4b6 blkfront: Fix blkfront backend switch race (bdev release)
We cannot read backend state within bdev operations, because it risks
grabbing the state change before xenbus gets to do it.

Fixed by tracking deferral with a frontend switch to Closing. State
exposure isn't strictly necessary, but the backends won't mind.

For a 'clean' deferral this seems actually a more decent protocol than
raising errors.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:45:12 +02:00
Daniel Stodden 139617437a blkfront: Fix blkfront backend switch race (bdev open)
We need not mind if users grab a late handle on a closing disk. We
probably even should not. But we have to make sure it's not a dead
one already.

Let the bdev deal with a gendisk deleted under its feet. Takes the
info mutex to decide a race against backend closing.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:38:43 +02:00
Daniel Stodden b70f5fa043 blkfront: Lock blkfront_info when closing
The bdev .open/.release fops race against backend switches to Closing,
handled by the XenBus thread.

The original code attempted to serialize block device holders and
xenbus only via bd_mutex. This is insufficient, the info->bd pointer
may already be stale (or null) while xenbus tries to bump up the
refcount.

Protect blkfront_info with a dedicated mutex.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:38:43 +02:00
Daniel Stodden a66b5aebb7 blkfront: Clean up vbd release
* Current blkfront_closing is rather a xlvbd_release_gendisk.
   Renamed in preparation of later patches (need the name again).

 * Removed the misleading comment -- this only applied to the backend
   switch handler, and the queue is already flushed btw.

 * Break out the xenbus call, callers know better when to switch
   frontend state.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:38:43 +02:00
Daniel Stodden 9897cb5323 blkfront: Fix gendisk leak
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:37 +02:00
Daniel Stodden 89de1669ac blkfront: Fix backtrace in del_gendisk
The call to del_gendisk follows a non-refcounted gd->queue
pointer. We release the last ref in blk_cleanup_queue. Fixed by
reordering releases accordingly.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:35 +02:00
K. Y. Srinivasan 2def141e71 xen/blkfront: revalidate after setting capacity
Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:31 +02:00
Jeremy Fitzhardinge b4dddb498c xen/blkfront: avoid compiler warning from missing cases
Fix:
drivers/block/xen-blkfront.c: In function ‘blkfront_connect’:
drivers/block/xen-blkfront.c:933: warning: enumeration value ‘BLKIF_STATE_DISCONNECTED’ not handled in switch

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:29 +02:00
K. Y. Srinivasan 1fa73be6be xen/front: Propagate changed size of VBDs
Support dynamic resizing of virtual block devices. This patch supports
both file backed block devices as well as physical devices that can be
dynamically resized on the host side.

Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:27 +02:00
Jan Beulich 5d7ed20e82 blkfront: don't access freed struct xenbus_device
Unfortunately commit "blkfront: fixes for 'xm block-detach ... --force'"
still wasn't quite right - there was a reference to freed memory left
from blkfront_closing().

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:31:12 +02:00
Jan Beulich 0e34582699 blkfront: fixes for 'xm block-detach ... --force'
Prevent prematurely freeing 'struct blkfront_info' instances (when the
xenbus data structures are gone, but the Linux ones are still needed).

Prevent adding a disk with the same (major, minor) [and hence the same
name and sysfs entries, which leads to oopses] when the previous
instance wasn't fully de-allocated yet.

This still doesn't address all issues resulting from forced detach:
I/O submitted after the detach still blocks forever, likely preventing
subsequent un-mounting from completing. It's not clear to me (not
knowing much about the block layer) how this can be avoided.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:28:55 +02:00
Ian Campbell 203fd61f42 xen: use less generic names in blkfront driver.
All Xen frontend drivers have a couple of identically named functions which
makes figuring out which device went wrong from a stacktrace harder than it
needs to be. Rename them to something specific to the device type.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:26:39 +02:00
Arnd Bergmann 6e9624b8ca block: push down BKL into .open and .release
The open and release block_device_operations are currently
called with the BKL held. In order to change that, we must
first make sure that all drivers that currently rely
on this have no regressions.

This blindly pushes the BKL into all .open and .release
operations for all block drivers to prepare for the
next step. The drivers can subsequently replace the BKL
with their own locks or remove it completely when it can
be shown that it is not needed.

The functions blkdev_get and blkdev_put are the only
remaining users of the big kernel lock in the block
layer, besides a few uses in the ioctl code, none
of which need to serialize with blkdev_{get,put}.

Most of these two functions is also under the protection
of bdev->bd_mutex, including the actual calls to
->open and ->release, and the common code does not
access any global data structures that need the BKL.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:25:34 +02:00
Arnd Bergmann 8a6cfeb6de block: push down BKL into .locked_ioctl
As a preparation for the removal of the big kernel
lock in the block layer, this removes the BKL
from the common ioctl handling code, moving it
into every single driver still using it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:25:00 +02:00
FUJITA Tomonori 00fff26539 block: remove q->prepare_flush_fn completely
This removes q->prepare_flush_fn completely (changes the
blk_queue_ordered API).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:15 +02:00
FUJITA Tomonori dd40e456a4 virtio_blk: stop using q->prepare_flush_fn
use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:14 +02:00
FUJITA Tomonori 98d8c8f40e ps3disk: stop using q->prepare_flush_fn
REQ_FLUSH flag enables us to kill ps3disk_prepare_flush().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:03 +02:00
FUJITA Tomonori 7f9815f09d osdblk: stop using q->prepare_flush_fn
use REQ_FLUSH flag instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:24:00 +02:00
Randy Dunlap 511d37af66 block/xd.c: fix brace typo
Fix extra brace typo that is causing build errors.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:23:14 +02:00
Christoph Hellwig 4c4762d10f block: fix some more cmd_type cleanup fallout
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:22:29 +02:00
Jens Axboe 15fa6e8165 virtio_blk: add default case to cmd type switch
On compilation, gcc correctly detects that we do not handle
all types:

In function ‘blk_done’:
warning: enumeration value ‘REQ_TYPE_FS’ not handled in switch
warning: enumeration value ‘REQ_TYPE_SENSE’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_SUSPEND’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_RESUME’ not handled in switch
warning: enumeration value ‘REQ_TYPE_PM_SHUTDOWN’ not handled in switch
warning: enumeration value ‘REQ_TYPE_LINUX_BLOCK’ not handled in switch
warning: enumeration value ‘REQ_TYPE_ATA_TASKFILE’ not handled in switch
warning: enumeration value ‘REQ_TYPE_ATA_PC’ not handled in switch

which is a bit pointless since this is at the end of the request
processing. Add a default case that just breaks out.
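
As a rough illustration of the fix described above, here is a minimal user-space sketch. The enum, struct and function names are hypothetical and are not taken from virtio_blk; the point is only that a `default:` case that breaks out stops gcc from warning about enumeration values the switch does not handle.

```c
/* Hypothetical command types, standing in for the kernel's REQ_TYPE_* values. */
enum cmd_type { CMD_SPECIAL, CMD_BLOCK_PC, CMD_FS, CMD_SENSE };

/* Hypothetical per-request result fields. */
struct result {
	int sense_len;
	int errors;
};

/* Only two types need extra completion work; everything else falls
 * through to the default case, which silences the "enumeration value
 * not handled in switch" warning. */
static void finish(enum cmd_type type, struct result *r)
{
	switch (type) {
	case CMD_BLOCK_PC:
		r->sense_len = 96;
		break;
	case CMD_SPECIAL:
		r->errors = 1;
		break;
	default:
		break;
	}
}
```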

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:22:26 +02:00
Christoph Hellwig 7b6d91daee block: unify flags for struct bio and struct request
Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver.  There were two flags in the bio that were
missing in the requests:  BIO_RW_UNPLUG and BIO_RW_AHEAD.  Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:20:39 +02:00
Christoph Hellwig 33659ebbae block: remove wrappers for request type/flags
Remove all the trivial wrappers for the cmd_type and cmd_flags fields in
struct requests.  This allows much easier grepping for different request
types instead of unwinding through macros.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:17:56 +02:00
Stephen Hemminger 01b6b67eda floppy: use warning macros
Convert assertions to use WARN().  There are several error checks in the
code for things that should never happen.  Convert them to standard
warnings so kerneloops.org will see them.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:43 +02:00
Stephen Hemminger b862f26fe1 floppy: use wait_event_interruptible
Convert wait loops to use wait_event_ macros.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:41 +02:00
Stephen Hemminger 21af544804 floppy: fix signed/unsigned warnings
Ioctl cmd value is unsigned, so change normalize_ioctl

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:39 +02:00
Stephen Hemminger be1c0fbfb4 floppy: cmos attribute should be static
As reported by sparse, cmos attribute is local.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:37 +02:00
Stephen Hemminger 575cfc673e floppy: use atomic type for usage_count
The usage_count was being protected by a lock which was only there to
create an atomic counter.
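
The idea can be sketched in user space with C11 atomics. This is an assumption-laden illustration, not the floppy driver's code: the kernel uses its own atomic_t API, and the helper names below are hypothetical.

```c
#include <stdatomic.h>

/* A counter that needs no separate lock: every update is a single
 * atomic read-modify-write. Static storage starts at zero. */
static atomic_int usage_count;

/* Take a reference; returns the new count. */
static int usage_get(void)
{
	return atomic_fetch_add(&usage_count, 1) + 1;
}

/* Drop a reference; returns the new count. */
static int usage_put(void)
{
	return atomic_fetch_sub(&usage_count, 1) - 1;
}
```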

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:36 +02:00
Stephen Hemminger 41a55b4de3 floppy: silence warning during disk test
The first thing the floppy does is read block 0 to test geometry and to
test for disk presence.  If disk is not present this causes a console
warning message about failed I/O.  Set flag to silence.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:34 +02:00
Stephen Hemminger be7a12bb1a floppy: remove unnecessary inlines
These routines are all big enough that it is better to let the compiler
decide to inline or not.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:32 +02:00
Stephen Hemminger 285203c8ff floppy: initialize debug jiffies offset
Set debug jiffies offset at initialization.  Avoids weird values showing
up if debugging enabled.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:30 +02:00
Mike Miller f3bcb14332 cciss: change pad value from 32 to 0
Change the command padding on 32-bit systems to 0 since setting it to 32
has the identical effect.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:29 +02:00
Mike Miller b0dd5cad3a cciss: remove errant debug code
Remove a debug statement left behind by accident.  It was commented out
after use but not deleted.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:27 +02:00
Mike Miller 29979a7122 cciss: move next_command function from ifdef
The definition of next_command also ended up in the wrong place.  It ended up
inside an "#ifdef CONFIG_PROCFS".  Already caught by Randy Dunlap and a
couple others.  Tried to put it somewhere that made sense.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:25 +02:00
Mike Miller b14aa6dcd0 cciss: fix call to put_controller_in_performant_mode
The call to put_controller_in_performant_mode was in the wrong place.
The call inadvertently ended up in an error path.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:23 +02:00
Mike Miller 256aea3fd3 cciss: make sure we request the performant mode irq
Make sure we register the performant mode interrupt.  Another blunder.
Seemed to work because put_controller_into_performant_mode was
never called.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:15:21 +02:00
Mike Miller 841fdffdd3 cciss: new controller support and bump driver version
Add support for new controllers due out next year.  HP must continue to
support new controllers in older distros.  All vendors require support be
upstream.  These controllers support only 16 commands in simple mode but
can support up to 1024 in performant mode.  See patch 5/6.  We have no
marketing names yet.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:51 +02:00
Mike Miller 5e216153c3 cciss: add performant mode support for Stars/Sirius
Add a mode of controller operation called Performant Mode.  Even though
cciss has been deprecated in favor of hpsa there are new controllers due
out next year that HP must support in older vendor distros.  Vendors
require all fixes/features be upstream.  These new controllers support
only 16 commands in simple mode but support up to 1024 in performant mode.
This requires us to add this support at this late date.

The performant mode transport minimizes host PCI accesses by performing
many completions per read.  PCI writes are posted so the host can write
then immediately get off the bus not waiting for the write to complete to
the target.  In the context of performant mode the host read out to a
controller pulls all posted writes into host memory ensuring the reply
queue is coherent.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:51 +02:00
Mike Miller 1d1414419f cciss: make interrupt access methods return type bool
Change the return type of our interrupt access routines to bool from
unsigned long.  It makes more sense that way.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:51 +02:00
Mike Miller 2cf3af1c9e cciss: check for msi in interrupt_not_for_us
Check to see if h->msi[x]_vector is set.  We need this for a following
patch.  Without this check we process one interrupt then stop because in
msi[x] mode the interrupt pending bit is not set.  Not sure why we didn't
encounter this before.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:35 +02:00
Mike Miller 0c2b39087c cciss: clean up interrupt handler
Simplify the interrupt handler code to more closely match hpsa and to
hopefully make it easier to follow.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:33 +02:00
Mike Miller 664a717d3a cciss: enqueue and submit io
Clean up some code where we submit our io.  The same 5 lines appeared
several times.  Also helps for a following patch.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-07 18:12:32 +02:00
Grant Likely 2dc1158137 of/device: Replace struct of_device with struct platform_device
of_device is just an alias for platform_device, so remove it entirely.  Also
replace to_of_device() with to_platform_device() and update comment blocks.

This patch was initially generated from the following semantic patch, and then
edited by hand to pick up the bits that coccinelle didn't catch.

@@
@@
-struct of_device
+struct platform_device

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: David S. Miller <davem@davemloft.net>
2010-08-06 09:25:50 -06:00
Linus Torvalds 552c7dbb34 Merge branch 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio_blk: Remove VBID ioctl
  virtio_blk: Add 'serial' attribute to virtio-blk devices (v2)
  virtio_blk: support barriers without FLUSH feature
2010-08-05 13:49:37 -07:00
Linus Torvalds db7a1535d2 Merge branch 'upstream/xen' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'upstream/xen' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: (23 commits)
  xen/panic: use xen_reboot and fix smp_send_stop
  Xen: register panic notifier to take crashes of xen guests on panic
  xen: support large numbers of CPUs with vcpu info placement
  xen: drop xen_sched_clock in favour of using plain wallclock time
  pvops: do not notify callers from register_xenstore_notifier
  Introduce CONFIG_XEN_PVHVM compile option
  blkfront: do not create a PV cdrom device if xen_hvm_guest
  support multiple .discard.* sections to avoid section type conflicts
  xen/pvhvm: fix build problem when !CONFIG_XEN
  xenfs: enable for HVM domains too
  x86: Call HVMOP_pagetable_dying on exit_mmap.
  x86: Unplug emulated disks and nics.
  x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock.
  implement O_NONBLOCK for /proc/xen/xenbus
  xen: Fix find_unbound_irq in presence of ioapic irqs.
  xen: Add suspend/resume support for PV on HVM guests.
  xen: Xen PCI platform device driver.
  x86/xen: event channels delivery on HVM.
  x86: early PV on HVM features initialization.
  xen: Add support for HVM hypercalls.
  ...
2010-08-05 13:45:50 -07:00
Ryan Harper 6c99a8528f virtio_blk: Remove VBID ioctl
With the availability of a sysfs device attribute for examining disk serial
numbers the ioctl is no longer needed.  The user-space changes for this aren't
upstream yet so we don't have any users to worry about.

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-05 13:05:31 +09:30
Ryan Harper a5eb9e4ff1 virtio_blk: Add 'serial' attribute to virtio-blk devices (v2)
Create a new attribute for virtio-blk devices that will fetch the serial number
of the block device.  This attribute can be used by udev to create disk/by-id
symlinks for devices that don't have a UUID (filesystem) associated with them.

ATA_IDENTIFY strings are special in that they can be up to 20 chars long
and aren't required to be nul-terminated.  The buffer is also zero-padded
meaning that if the serial is 19 chars or less that we get a nul-terminated
string.  When copying this value into a string buffer, we must be careful to
copy up to the nul (if it is present) and only 20 if it is longer and not to
attempt to nul terminate; this isn't needed.

Changes since v1:
- Added BUILD_BUG_ON() for PAGE_SIZE check
- Removed min() since BUILD_BUG_ON() handles the check
- Replaced serial_sysfs() by copying id directly to buffer
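
The copy rule described above (stop at the first nul, never read more than 20 bytes) can be sketched in user space as follows. The function name and `SERIAL_BYTES` are hypothetical. Unlike the patch, which copies into a zeroed sysfs buffer and does not terminate explicitly, this sketch adds a terminator so the result is usable as a C string.

```c
#include <stddef.h>
#include <string.h>

#define SERIAL_BYTES 20	/* maximum ID length, per the commit text */

/* Copy a possibly unterminated, zero-padded ID into dst.
 * dst must hold at least SERIAL_BYTES + 1 bytes.
 * Returns the number of ID bytes copied (0..20). */
static size_t copy_serial(char *dst, const char id[SERIAL_BYTES])
{
	size_t n = 0;

	/* Stop at the first nul, or after 20 bytes if there is none. */
	while (n < SERIAL_BYTES && id[n] != '\0')
		n++;

	memcpy(dst, id, n);
	dst[n] = '\0';	/* user-space convenience, not part of the patch */
	return n;
}
```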

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-05 13:05:30 +09:30
Christoph Hellwig 10bc310c27 virtio_blk: support barriers without FLUSH feature
If we want to support barriers with the cache=writethrough mode in qemu
we need to tell the block layer that we only need queue drains to
implement a barrier.  Follow the model set by SCSI and IDE and assume
that there is no volatile write cache if the host doesn't advertize it.
While this might imply working barriers on old qemu versions or other
hypervisors that actually have a volatile write cache this is only a
cosmetic issue - these hypervisors don't guarantee any data integrity
with or without this patch, but with the patch we at least provide
data ordering.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-05 13:05:29 +09:30
Jiri Kosina d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Stefano Stabellini b98a409b80 blkfront: do not create a PV cdrom device if xen_hvm_guest
It is not possible to unplug emulated cdrom devices, and PV cdroms don't
handle media insert, eject and stream, so we are better off disabling PV
cdroms when running as a Xen HVM guest.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2010-07-29 11:11:08 -07:00
Stefano Stabellini c1c5413ad5 x86: Unplug emulated disks and nics.
Add a xen_emul_unplug command line option to the kernel to unplug
xen emulated disks and nics.

Set the default value of xen_emul_unplug depending on whether or
not the Xen PV frontends and the Xen platform PCI driver have
been compiled for this kernel (modules or built-in are both OK).

The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM
even if the host platform doesn't support unplug.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-26 23:13:25 -07:00
Kulikov Vasiliy 0e4a9d03df block: cciss: use ARRAY_SIZE
Change sizeof(x) / sizeof(*x) to ARRAY_SIZE(x).
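
For readers unfamiliar with the macro, a minimal user-space sketch of the same idiom follows. The macro here is a simplified stand-in (the kernel's ARRAY_SIZE additionally rejects pointer arguments at compile time), and the table and function are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel macro. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical table, used only for illustration. */
static const int example_ids[] = { 0x3220, 0x3230, 0x3238, 0x323a };

/* Element count, computed without repeating the sizeof division. */
static size_t count_example_ids(void)
{
	return ARRAY_SIZE(example_ids);
}
```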

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-20 17:02:03 +02:00
Pavel Machek a2531293db update email address
pavel@suse.cz no longer works, replace it with working address.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-19 10:56:54 +02:00
Uwe Kleine-König 698f93159a fix comment/printk typos concerning "already"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-11 21:45:40 +02:00
Stephen M. Cameron 79600aadcf cciss: set SCSI max cmd len to 16, as default is wrong
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: Mike Miller <mikem@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-15 08:12:34 +02:00
Jens Axboe 552618d124 cpqarray: fix two more wrong section type
cpqarray_register_ctlr() and cpqarray_eisa_detect() also
need to be marked as __devinit.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-14 15:21:33 +02:00
Jens Axboe d4a3895f5d cpqarray: fix wrong __init type on pci probe function
It needs to be __devinit, not __init.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-14 12:55:09 +02:00
Philipp Reisner dc66c74de6 drbd: Fixed a race between disk-attach and unexpected state changes
This was a very hard to trigger race condition.

If we got a state packet from the peer, after drbd_nl_disk() has
already changed the disk state to D_NEGOTIATING but
after_state_ch() was not yet run by the worker, then receive_state()
might call drbd_sync_handshake(), which in turn crashed
when accessing p_uuid.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-06-14 12:19:41 +02:00
Linus Torvalds d2dd328b7f Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits)
  block: make blk_init_free_list and elevator_init idempotent
  block: avoid unconditionally freeing previously allocated request_queue
  pipe: change /proc/sys/fs/pipe-max-pages to byte sized interface
  pipe: change the privilege required for growing a pipe beyond system max
  pipe: adjust minimum pipe size to 1 page
  block: disable preemption before using sched_clock()
  cciss: call BUG() earlier
  Preparing 8.3.8rc2
  drbd: Reduce verbosity
  drbd: use drbd specific ratelimit instead of global printk_ratelimit
  drbd: fix hang on local read errors while disconnected
  drbd: Removed the now empty w_io_error() function
  drbd: removed duplicated #includes
  drbd: improve usage of MSG_MORE
  drbd: need to set socket bufsize early to take effect
  drbd: improve network latency, TCP_QUICKACK
  drbd: Revert "drbd: Create new current UUID as late as possible"
  brd: support discard
  Revert "writeback: fix WB_SYNC_NONE writeback from umount"
  Revert "writeback: ensure that WB_SYNC_NONE writeback with sb pinned is sync"
  ...
2010-06-04 15:37:44 -07:00
Linus Torvalds 39059cceed Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/macio: Fix probing of macio devices by using the right of match table
  agp/uninorth: Fix oops caused by flushing too much
  powerpc/pasemi: Update MAINTAINERS file
  powerpc/cell: Fix integer constant warning
  powerpc/kprobes: Remove resume_execution() in kprobes
  powerpc/macio: Don't dereference pointer before null check
2010-06-03 15:46:37 -07:00
Christoph Hellwig a5b365a652 virtio-blk: fix minimum number of S/G elements
We need at least one S/G element to operate properly, as does the block
layer which increments it to one anyway.  We hit this due to a qemu
bug which advertises a sg_elements of 0 under some circumstances.
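
A minimal sketch of the clamp described above, assuming a hypothetical helper name; the actual driver logic was tweaked on merge and may differ in form.

```c
/* Treat an advertised S/G element count of 0 as 1, since at least
 * one element is needed to operate. */
static unsigned int effective_sg_elems(unsigned int advertised)
{
	return advertised ? advertised : 1;
}
```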

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (tweaked logic)
2010-06-03 22:39:18 +09:30
Benjamin Herrenschmidt c2cdf6aba0 powerpc/macio: Fix probing of macio devices by using the right of match table
Grant's patches added an of match table to struct device_driver. However,
while he changed the macio device code to use that, he left the match
table pointer in struct macio_driver and didn't update drivers to use
the "new" one, thus breaking the probing.

This completes the change by moving all drivers to setup the "new"
one, removing all traces of the old one, and while at it (since it
changes the exact same locations), I also remove two other duplicates
from struct driver which are the name and owner fields.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-02 17:50:38 +10:00
Jens Axboe b4ca761577 Merge branch 'master' into for-linus
Conflicts:
	fs/pipe.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 12:42:12 +02:00
Dan Carpenter 713b686494 cciss: call BUG() earlier
I moved the range check after the increment.  The current code would
write past the end of the array once before calling BUG().
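
The general bug class can be illustrated with a small user-space sketch. It is not the cciss code: the names are hypothetical, and it returns false where the driver would call BUG(). The point is that the bound is tested against the index about to be used, so no store ever lands past the end of the array.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOTS 4	/* hypothetical capacity */

struct chain {
	int slot[SLOTS];
	size_t used;
};

/* Check the range before the store, not one iteration too late. */
static bool chain_push(struct chain *c, int v)
{
	if (c->used >= SLOTS)
		return false;	/* the driver would BUG() here */
	c->slot[c->used++] = v;
	return true;
}
```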

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 12:17:48 +02:00
Philipp Reisner 2a0ab2cd73 drbd: Reduce verbosity
The "Local READ/WRITE failed" messages are too verbose.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg 7383506c87 drbd: use drbd specific ratelimit instead of global printk_ratelimit
using the global printk_ratelimit() may mask other messages.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg d255e5ff5f drbd: fix hang on local read errors while disconnected
"canceled" w_read_retry_remote never completed, if they have been
canceled after drbd_disconnect connection teardown cleanup has already
run (or we are currently not connected anyways).

Fixed by not queueing a remote retry if we already know it won't work
(pdsk not uptodate), and cleanup ourselves on "cancel", in case we hit a
race with drbd_disconnect.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Philipp Reisner 32fa7e91f9 drbd: Removed the now empty w_io_error() function
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Andrea Gelmini 039e1fb654 drbd: removed duplicated #includes
drbd/drbd_receiver.c: linux/mm.h is included more than once.

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg ba11ad9a3b drbd: improve usage of MSG_MORE
It seems to improve performance if we allow the "p_data" header in its
own frame (no MSG_MORE), but sendpage all but the last page with MSG_MORE.
This is also in preparation of a later zero copy receive implementation.

Suggested by Eduard.Guzovsky@stratus.com on drbd-dev.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg 5dbf167338 drbd: need to set socket bufsize early to take effect
quoting tcp(7):
    On individual connections, the socket buffer size must be set prior to the
    listen(2) or connect(2) calls in order to have it take effect.

This adds a wrapper to do so, and uses it appropriately.
Improves performance in certain situations.

Note that because we cannot easily determine which socket will be
"meta" and which "data" (bulk) socket, we adjust both sockets.
Previously, DRBD only adjusted the bufsizes of the "data" socket.
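
In user-space terms, the ordering that tcp(7) asks for looks roughly like the sketch below. This is an illustration using the POSIX sockets API, not the DRBD wrapper itself; the function name is hypothetical, and the caller would go on to connect(2) or listen(2) with the returned descriptor.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Create a TCP socket and size its buffers before any connect(2) or
 * listen(2) call. A size of 0 leaves that buffer at the system default.
 * Returns the descriptor, or -1 on failure. */
static int make_sized_socket(int sndbuf, int rcvbuf)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (sndbuf > 0 &&
	    setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0)
		goto fail;
	if (rcvbuf > 0 &&
	    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0)
		goto fail;
	return fd;

fail:
	close(fd);
	return -1;
}
```

Note that the kernel may round or scale the requested value, so reading the option back with getsockopt(2) is the only reliable way to see the effective size.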

Thanks again to Eduard.Guzovsky@stratus.com.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Lars Ellenberg 344fa462e3 drbd: improve network latency, TCP_QUICKACK
On Thu, Apr 29, 2010 at 04:00:50PM -0400, Eduard.Guzovsky@stratus.com
 wrote on drbd-dev@lists.linbit.com
 Subject: [Drbd-dev] DRBD small synchronous writes performance improvements

> 1. TCP_QUICKACK option is set incorrectly. The goal was force TCP to
> send and ACK as a  "one time" event.  Instead the code permanently sets
> connection in the QUICKACK mode.

He is right, we actually want to use an even val with TCP_QUICKACK.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:27 +02:00
Philipp Reisner 2c8d196759 drbd: Revert "drbd: Create new current UUID as late as possible"
The late-UUID writing is delayed until the next release.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-06-01 11:12:26 +02:00
Nick Piggin b7c335713e brd: support discard
Support discard requests in brd by zeroing or deleting the underlying backing
pages. This is simply to help with the testing and documentation nature of
the brd code.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-06-01 11:09:20 +02:00
Grant Likely cf9b59e9d3 Merge remote branch 'origin' into secretlab/next-devicetree
Merging in current state of Linus' tree to deal with merge conflicts and
build failures in vio.c after merge.

Conflicts:
	drivers/i2c/busses/i2c-cpm.c
	drivers/i2c/busses/i2c-mpc.c
	drivers/net/gianfar.c

Also fixed up one line in arch/powerpc/kernel/vio.c to use the
correct node pointer.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22 00:36:56 -06:00
Grant Likely 4018294b53 of: Remove duplicate fields from of_platform_driver
.name, .match_table and .owner are duplicated in both of_platform_driver
and device_driver.  This patch removes the extra copies from struct
of_platform_driver and converts all users to the device_driver members.

This patch is a pretty mechanical change.  The usage model doesn't change
and if any drivers have been missed, or if anything has been fixed up
incorrectly, then it will fail with a compile time error, and the fixup
will be trivial.  This patch looks big and scary because it touches so
many files, but it should be pretty safe.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sean MacLennan <smaclennan@pikatech.com>
2010-05-22 00:10:40 -06:00
Linus Torvalds e8bebe2f71 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (69 commits)
  fix handling of offsets in cris eeprom.c, get rid of fake on-stack files
  get rid of home-grown mutex in cris eeprom.c
  switch ecryptfs_write() to struct inode *, kill on-stack fake files
  switch ecryptfs_get_locked_page() to struct inode *
  simplify access to ecryptfs inodes in ->readpage() and friends
  AFS: Don't put struct file on the stack
  Ban ecryptfs over ecryptfs
  logfs: replace inode uid,gid,mode initialization with helper function
  ufs: replace inode uid,gid,mode initialization with helper function
  udf: replace inode uid,gid,mode init with helper
  ubifs: replace inode uid,gid,mode initialization with helper function
  sysv: replace inode uid,gid,mode initialization with helper function
  reiserfs: replace inode uid,gid,mode initialization with helper function
  ramfs: replace inode uid,gid,mode initialization with helper function
  omfs: replace inode uid,gid,mode initialization with helper function
  bfs: replace inode uid,gid,mode initialization with helper function
  ocfs2: replace inode uid,gid,mode initialization with helper function
  nilfs2: replace inode uid,gid,mode initialization with helper function
  minix: replace inode uid,gid,mode init with helper
  ext4: replace inode uid,gid,mode init with helper
  ...

Trivial conflict in fs/fs-writeback.c (mark bitfields unsigned)
2010-05-21 19:37:45 -07:00
Linus Torvalds 1756ac3d3c Merge branch 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
  drivers/char: Eliminate use after free
  virtio: console: Accept console size along with resize control message
  virtio: console: Store each console's size in the console structure
  virtio: console: Resize console port 0 on config intr only if multiport is off
  virtio: console: Add support for nonblocking write()s
  virtio: console: Rename wait_is_over() to will_read_block()
  virtio: console: Don't always create a port 0 if using multiport
  virtio: console: Use a control message to add ports
  virtio: console: Move code around for future patches
  virtio: console: Remove config work handler
  virtio: console: Don't call hvc_remove() on unplugging console ports
  virtio: console: Return -EPIPE to hvc_console if we lost the connection
  virtio: console: Let host know of port or device add failures
  virtio: console: Add a __send_control_msg() that can send messages without a valid port
  virtio: Revert "virtio: disable multiport console support."
  virtio: add_buf_gfp
  trans_virtio: use virtqueue_xxx wrappers
  virtio-rng: use virtqueue_xxx wrappers
  virtio_ring: remove a level of indirection
  virtio_net: use virtqueue_xxx wrappers
  ...

Fix up conflicts in drivers/net/virtio_net.c due to new virtqueue_xxx
wrappers changes conflicting with some other cleanups.
2010-05-21 17:22:52 -07:00
Christoph Hellwig 8018ab0574 sanitize vfs_fsync calling conventions
Now that the last user passing a NULL file pointer is gone we can remove
the redundant dentry argument and associated hacks inside vfs_fsync_range.

The next step will be removing the dentry argument from ->fsync, but given
the luck with the last round of method prototype changes I'd rather
defer this until after the main merge window.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21 18:31:21 -04:00
Jens Axboe ee9a3607fb Merge branch 'master' into for-2.6.35
Conflicts:
	fs/ext3/fsync.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:27:26 +02:00
Philipp Reisner 4e23a59ed1 drbd: Do not free p_uuid early, this is done in the exit code of the receiver
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:01 +02:00
Philipp Reisner 23ce422748 drbd: Null pointer deref fix to the large "multi bio rewrite"
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:01 +02:00
Philipp Reisner fc8ce1941d drbd: Fix: Do not detach, if a bio with a barrier fails
Introduced a few days ago:
  commit 45bb912bd5
  Author: Lars Ellenberg <lars.ellenberg@linbit.com>
  Date:   Fri May 14 17:10:48 2010 +0200

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:00 +02:00
Philipp Reisner 4604d63668 drbd: Ensure to not trigger late-new-UUID creation multiple times
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:00 +02:00
Philipp Reisner 31a31dccdd drbd: Do not Oops when C_STANDALONE when uuid gets generated
Got introduced with

commit 0c3f34516e
Author: Philipp Reisner <philipp.reisner@linbit.com>
Date:   Mon May 17 16:10:43 2010 +0200

    drbd: Create new current UUID as late as possible

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:12:00 +02:00
David Zeuthen c3473c6354 generate "change" uevent for loop device
Recent udev versions probe loop devices for filesystems meaning that
the /dev/disk hierarchy may contain useful entries such as

 $ ls -l /dev/disk/by-label/Fedora-12-x86_64-Live
 lrwxrwxrwx 1 root root 11 Mar 11 13:41 /dev/disk/by-label/Fedora-12-x86_64-Live -> ../../loop0

Unfortunately, no "change" uevent is generated when the loop device is
detached so the symlink persists. Additionally, no "change" uevent is
guaranteed to be generated when attaching an fd or changing capacity.
For example, user space could open the loop device O_RDONLY (in fact,
recent util-linux-ng does this) so udev's OPTIONS+="watch" machinery may
not trigger the "change" uevent.

This patch ensures that the "change" uevent is generated in all of
these cases. As a result, the /dev/disk hierarchy works as expected
for loop devices.

Signed-off-by: David Zeuthen <davidz@redhat.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:30 -07:00
Linus Torvalds f39d01be4c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
  vlynq: make whole Kconfig-menu dependant on architecture
  add descriptive comment for TIF_MEMDIE task flag declaration.
  EEPROM: max6875: Header file cleanup
  EEPROM: 93cx6: Header file cleanup
  EEPROM: Header file cleanup
  agp: use NULL instead of 0 when pointer is needed
  rtc-v3020: make bitfield unsigned
  PCI: make bitfield unsigned
  jbd2: use NULL instead of 0 when pointer is needed
  cciss: fix shadows sparse warning
  doc: inode uses a mutex instead of a semaphore.
  uml: i386: Avoid redefinition of NR_syscalls
  fix "seperate" typos in comments
  cobalt_lcdfb: correct sections
  doc: Change urls for sparse
  Powerpc: wii: Fix typo in comment
  i2o: cleanup some exit paths
  Documentation/: it's -> its where appropriate
  UML: Fix compiler warning due to missing task_struct declaration
  UML: add kernel.h include to signal.c
  ...
2010-05-20 09:20:59 -07:00
Michael S. Tsirkin 09ec6b69d2 virtio_blk: use virtqueue_xxx wrappers
Switch virtio_blk to new virtqueue_xxx wrappers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:42 +09:30
Rusty Russell bdb4a13057 virtio_blk: remove multichar constant.
drivers/block/virtio_blk.c:228:13: warning: multi-character character constant

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: john cooper <john.cooper@redhat.com>
2010-05-19 22:15:41 +09:30
john cooper 234f2725a5 Add virtio disk identification ioctl
Return serial string to the guest application via
ioctl driver call.

Note this form of interface to the guest userland
was the consensus when the prior version using
the ATA_IDENTIFY came under dispute.

Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:40 +09:30
john cooper 4cb2ea28c5 Add virtio disk identification support
Add virtio-blk device id (s/n) support via virtio request.

Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:40 +09:30
Grant Likely 61c7a080a5 of: Always use 'struct device.of_node' to get device node pointer.
The following structure elements duplicate the information in
'struct device.of_node' and so are being eliminated.  This patch
makes all readers of these elements use device.of_node instead.

(struct of_device *)->node
(struct dev_archdata *)->prom_node (sparc)
(struct dev_archdata *)->of_node (powerpc & microblaze)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-18 16:10:44 -06:00
Linus Torvalds 1014cfe2fb Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: Reduce stack_trace usage
  lockdep: No need to disable preemption in debug atomic ops
  lockdep: Actually _dec_ in debug_atomic_dec
  lockdep: Provide off case for redundant_hardirqs_on increment
  lockdep: Simplify debug atomic ops
  lockdep: Fix redundant_hardirqs_on incremented with irqs enabled
  lockstat: Make lockstat counting per cpu
  i8253: Convert i8253_lock to raw_spinlock
2010-05-18 08:17:35 -07:00
Julia Lawall 2db4e42eac drivers/block/drbd: Use kzalloc
Use kzalloc rather than the combination of kmalloc and memset.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,size,flags;
statement S;
@@

-x = kmalloc(size,flags);
+x = kzalloc(size,flags);
 if (x == NULL) S
-memset(x, 0, size);
// </smpl>
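
The same transformation has a direct userspace analogue, shown here as a sketch: malloc() plus memset() stands in for kmalloc() plus memset(), and calloc() stands in for kzalloc(). The struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure used only for this illustration. */
struct item {
    int id;
    char name[16];
};

/* Before: allocate, check for failure, then clear by hand. */
struct item *item_new_old(void)
{
    struct item *x = malloc(sizeof(*x));

    if (x == NULL)
        return NULL;
    memset(x, 0, sizeof(*x));
    return x;
}

/* After: a single zeroing allocation, with the same result on success. */
struct item *item_new(void)
{
    return calloc(1, sizeof(struct item));
}
```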

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:04:10 +02:00
Philipp Reisner 0c3f34516e drbd: Create new current UUID as late as possible
The choice was to either delay creation of the new UUID until
IO got thawed or to delay it until the first IO request.

Both are correct, the latter is more friendly to users of
dual-primary setups, that actually only write on one side.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:49 +02:00
Philipp Reisner 9a25a04c80 drbd: If we detect late that IO got frozen, retry after we thawed.
If we detect late (= after grabbing mdev->req_lock) that IO got frozen, we
return 1 to generic_make_request(), which simply will retry to make a
request for that bio.

In the subsequent call of generic_make_request() into drbd_make_request_26()
we sleep in inc_ap_bio().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:32 +02:00
Lars Ellenberg a1c88d0d7a drbd: always use_bmbv, ignore setting
Now that the peer may handle multi-bio EEs,
we can ignore the peer's limit,
and concentrate on the limits of the local IO stack.

This is safe across drbd protocol versions,
as our queue_max_sectors() will be adjusted accordingly.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:03:05 +02:00
Lars Ellenberg bb3d000cb9 drbd: allow resync requests to be larger than max_segment_size
this should allow for better background resync performance.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:02:36 +02:00
Lars Ellenberg 45bb912bd5 drbd: Allow drbd_epoch_entries to use multiple bios.
This should allow for better performance if the lower level IO stack
of the peers differs in limits exposed either via the queue,
or via some merge_bvec_fn.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 02:01:23 +02:00
Lars Ellenberg 708d740ed8 drbd: reduce sizeof struct drbd_epoch_entry by 8 byte by aligning members
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:28:35 +02:00
Philipp Reisner 162f3ec7f0 drbd: Fixes to the new delay_probes code
* Only send delay_probes with protocol 93 or newer
* drbd_send_delay_probes() is called only from worker context,
  no atomic_t needed for delay_seq

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:28:08 +02:00
Philipp Reisner a8cdfd8d3b drbd: Fixes to the new resync speed code
* Mention P_DELAY_PROBE in the packet naming array
* Do not corrupt the mdev->data.work list in case the timer goes
  off before delay_probe_work got handled by the worker
* Do not mod_timer() twice for a single delay_probe pair

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:26:51 +02:00
Philipp Reisner eedf386ae9 drbd: Proc bits of new resync speed stuff
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:26:27 +02:00
Philipp Reisner cdd67a7460 drbd: Control the actual resync rate based on the queuing delay of data packets
In a setup with a high bandwidth and high latency network, possibly
involving deep queues in routers, it is beneficial to only fill those
queues up to a limited extent with resync data.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:47 +02:00
Philipp Reisner bd26bfc5b4 drbd: Actually send delay probes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:28 +02:00
Philipp Reisner 67c7ddd055 drbd: Four new configuration settings for resync speed control
To reasonably control resync speed over drbd-proxy connections,
drbd has to measure the current delay of packets transmitted over
the (possibly congested) data socket vs the meta-data socket.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:00 +02:00
Philipp Reisner 7237bc430f drbd: Sending of delay_probes
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:22:46 +02:00
Philipp Reisner 0ced55a3be drbd: Receiving of delay_probes
Delay_probes are new packets in the DRBD protocol, which allow
DRBD to know the current delay packets have on the data socket.
(relative to the meta data socket)

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:22:11 +02:00
Philipp Reisner 5223671bb0 drbd: Fixed bitmap in case of online-grow without resync
The "surplus" bits of the old (smaller) bitmap must be clean
in case of online-grow without resync.

Note: Reverted 67ae8b80d4a116ab3b7094eb3723506b20c06dff as
well, since the lines added by this patch are redundant. The
bits get set by the bm_set_surplus(b) call before that.
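
The surplus-bit handling can be sketched in plain C. This is an illustration under assumptions, not DRBD's bitmap code: the 64-bit word layout, the names bitmap_grow() and bitmap_weight(), and the set_new_bits flag as modelled here are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITS_PER_WORD 64u

/* Number of 64-bit words needed to hold `bits` bits. */
static size_t words_for(size_t bits)
{
    return (bits + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Mask of the "surplus" bits in the last word, i.e. positions at or
 * beyond `bits`; 0 if `bits` is a multiple of the word size. */
static uint64_t surplus_mask(size_t bits)
{
    size_t used = bits % BITS_PER_WORD;

    return used ? ~UINT64_C(0) << used : 0;
}

/* Grow a bitmap in place from old_bits to new_bits. `words` must have
 * room for words_for(new_bits) words. With set_new_bits the added range
 * is marked (needs resync); without it the added range, including the
 * old surplus bits, must end up clean. */
void bitmap_grow(uint64_t *words, size_t old_bits, size_t new_bits,
                 int set_new_bits)
{
    size_t old_words = words_for(old_bits);
    size_t new_words = words_for(new_bits);

    assert(new_bits >= old_bits);

    /* The old surplus bits become real bits after growing. */
    if (old_words > 0) {
        if (set_new_bits)
            words[old_words - 1] |= surplus_mask(old_bits);
        else
            words[old_words - 1] &= ~surplus_mask(old_bits);
    }

    /* Fill the newly added words. */
    memset(words + old_words, set_new_bits ? 0xff : 0x00,
           (new_words - old_words) * sizeof(*words));

    /* Keep the new surplus bits clear so bit counts stay exact. */
    if (new_words > 0)
        words[new_words - 1] &= ~surplus_mask(new_bits);
}

/* Count the set bits among the first `bits` positions. */
size_t bitmap_weight(const uint64_t *words, size_t bits)
{
    size_t n = 0;

    for (size_t i = 0; i < bits; i++)
        n += (words[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1u;
    return n;
}
```

In this model, growing a 10-bit bitmap with dirty surplus bits to 100 bits without set_new_bits leaves only the originally set bits counted, which is the property the commit message asks for.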

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:20:33 +02:00
Philipp Reisner 6b4388ac1f drbd: Added transmission faults to the fault injection code
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:19:51 +02:00
Philipp Reisner 087c24925c drbd: bugfix: Make resize work, if remote's size was limiting and increased in the meantime
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:18:22 +02:00
Philipp Reisner 6495d2c6d0 drbd: Implemented the --assume-clean option for drbdsetup resize
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:17:47 +02:00
Philipp Reisner b4ee79dac3 drbd: Added some missing statics
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:17:11 +02:00
Philipp Reisner fd76438c24 drbd: Make sure to resync all of the new storage upon online resize
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:16:20 +02:00
Philipp Reisner e89b591c3a drbd: Implemented flags for the resize packet
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:15:44 +02:00
Philipp Reisner 02d9a94bbb drbd: Implemented the set_new_bits parameter for drbd_bm_resize()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:14:43 +02:00
Philipp Reisner d845030f21 drbd: made determin_dev_size's parameter a flag enum
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:14:04 +02:00
Adam Gandelman 3a11a48789 drbd: New handler: initial-split-brain
Some wish to be notified of all instances of split brain, not just those that
go unresolved.  The initial-split-brain handler is called to notify someone
upon detection of all split brain conditions even if auto-recovery policies
are configured.

Signed-off-by: Adam Gandelman <adam.gandelman@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:13:33 +02:00
Lars Ellenberg 979f5c7f1f drbd: fail_requests_early: remove incorrect and unnecessary optimization
The condition does not fit the comment (I may well be Primary,
even if I lost the disk earlier and now the connection).

And this is caught below anyway, where it also gets logged.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:10:31 +02:00
Lars Ellenberg 6666032ade drbd: check for corrupt or malicious sector addresses when receiving data
Even if it should never happen if the peer does behave, we need to
double check, and not even attempt access beyond end of device.
It usually would be caught by lower layers, resulting in "IO error",
but may also end up in the internal meta data area.
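
A range check of the kind described above can be sketched as follows. The function name request_in_range(), the 512-byte sector size, and the parameter types are assumptions for illustration; the comparison is written so that sector + length cannot overflow.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Hypothetical check: does a request of `size_bytes` starting at
 * `sector` fit entirely inside a device of `capacity` sectors?
 * Partial sectors are rounded up. */
bool request_in_range(uint64_t sector, uint32_t size_bytes,
                      uint64_t capacity)
{
    uint64_t nr = ((uint64_t)size_bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;

    if (nr > capacity)
        return false;
    /* Equivalent to sector + nr <= capacity, but overflow-safe. */
    return sector <= capacity - nr;
}
```

A receiver would run such a check before submitting any I/O and treat a failure as a protocol violation rather than passing the request down to lower layers.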

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:09:57 +02:00
Philipp Reisner c3fe30b0e7 drbd: cleanup: This code path to trigger a resync is no longer needed
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:09:13 +02:00
Lars Ellenberg 8d4ce82b3c drbd: don't start a resync without access to up-to-date Data
In case both nodes are "inconsistent", invalidate would
have started a resync anyways, without a chance to ever
succeed, just filling the logs with warning messages.

Simply disallow that state change,
re-using the SS_NO_UP_TO_DATE_DISK return value.

This also changes the corresponding error string to
"Need access to UpToDate Data" -- I found the
"Refusing to be Primary without at least one UpToDate disk"
answer misleading in some situations anyways.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:08:18 +02:00
Lars Ellenberg c3470cde57 drbd: fix potential protocol error
Don't forget to drain the digest in case we cannot satisfy a
checksum based resync or online-verify request.

It would additionally cause a protocol error,
dropping the connection.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:07:38 +02:00
Lars Ellenberg 8d1894ebe4 drbd: remove bogus ASSERT
block_id may be ID_SYNCER,
as well as checksum based resync request magic, or online verify magic.

Let's just drop that ASSERT.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:06:59 +02:00
Lars Ellenberg e0f83012dc drbd: fix regression: attach while connected failed
commit e4f925e12e
Author: Philipp Reisner <philipp.reisner@linbit.com>
Date:   Wed Mar 17 14:18:41 2010 +0100

    drbd: Do not upgrade state to Outdated if already Inconsistent

prevented the necessary state transition for attaching while connected
(Diskless -> Consistent respectively Outdated).
This is the fix for the fix.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:06:07 +02:00
Philipp Reisner e4f925e12e drbd: Do not upgrade state to Outdated if already Inconsistent [Bugz 277]
There was a race condition:
  In a situation with a SyncSource+Primary and a SyncTarget+Secondary node,
  and a resync dependency to some other device. After both nodes decided
  to do the resync, the other device finishes its resync process.
  At that time SyncSource already sent the P_SYNC_UUID packet, and
  already updated its peer disk state to Inconsistent.
  The SyncTarget node waits for the P_SYNC_UUID and sends a state packet
  to report the resync dependency change. That packet still carries
  a disk state of Outdated.

Impact:
  If application writes come in during that time on the Primary node,
  those do not get replicated, and the out-of-sync counter gets increased.
  => The completion of resync is not detected on the primary node.
  => stalled.
  Those blocks get resync'ed with the next resync, since they get
  marked as out-of-sync in the bitmap.

In order to fix this, we filter out that wrong state change in the
sanitize_state() function.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:01:05 +02:00
Lars Ellenberg 8c484ee491 drbd: use proc_create_data with explicit NULL argument
To document that we know about deprecation of proc_create,
even though we are not affected, as we don't use the ->data member,
open code proc_create_data(..., NULL);

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 00:59:00 +02:00
Geert Uytterhoeven 92183b346f m68k: amiga - Floppy platform device conversion
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2010-05-17 21:37:45 +02:00
Bill Pemberton c2d45b4da0 cciss: fix shadows sparse warning
Fix sparse warnings:

drivers/block/cciss.c:1591:37: warning: symbol 'i' shadows an earlier one
drivers/block/cciss.c:2437:21: warning: symbol 'i' shadows an earlier one

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-05-11 09:59:26 +02:00
Randy Dunlap 2395e463fe paride: fix menu indentation
Make the PARIDE menu be displayed correctly, with proper/expected
indentation, by moving the GDROM kconfig symbol, which was
splitting the PARIDE kconfig symbol from its dependent symbols.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-11 09:02:55 +02:00
Jens Axboe 6a7cc883d6 Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-linus 2010-05-04 08:48:53 +02:00
Lars Ellenberg 5c3c7e64bb drbd: don't expose failed local READ to upper layers
fix regression introduced in 8.3.3:
 commit a9b17323f2875f5d9b132c2b476a750bf44b10c7
 Author: Lars Ellenberg <lars.ellenberg@linbit.com>
 Date:   Wed Aug 12 15:18:33 2009 +0200

     out-of-spinlock completion of master bio

 : (bio_rw(bio) == READA)
    ? read_completed_with_error
    : read_ahead_completed_with_error;

is obviously not what was intended.
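
The intended mapping can be sketched with stand-in types. Everything here is hypothetical except the idea itself: a read-ahead must map to the read-ahead error event and a plain read to the read error event. The leading write case is an assumption, since the quoted fragment starts in the middle of a conditional expression.

```c
/* Hypothetical stand-ins for the request direction and completion events. */
enum rw_kind { RW_READ, RW_READA, RW_WRITE };

enum req_event {
    WRITE_COMPLETED_WITH_ERROR,
    READ_COMPLETED_WITH_ERROR,
    READ_AHEAD_COMPLETED_WITH_ERROR
};

/* Pick the error event for a failed local request. The two read
 * branches are in the order the quoted code had swapped. */
enum req_event error_event(enum rw_kind rw)
{
    return (rw == RW_WRITE) ? WRITE_COMPLETED_WITH_ERROR
         : (rw == RW_READA) ? READ_AHEAD_COMPLETED_WITH_ERROR
         : READ_COMPLETED_WITH_ERROR;
}
```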

No one noticed because of
 * page-cache at work,
 * local RAIDs

Impact:
Failed local READs are not retried remotely,
but errored to upper layers, causing filesystems
to remount read-only, or worse.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-03 22:40:16 +02:00
Ingo Molnar 53ba4f2fa7 Merge commit 'v2.6.34-rc6' into core/locking 2010-05-03 09:17:01 +02:00
Arnd Bergmann f80a0ca6ad pktcdvd: improve BKL and compat_ioctl.c usage
The pktcdvd driver uses proper locking and does not need the BKL in the
ioctl and llseek functions of the character device, so kill both.

Moving the compat_ioctl handling from common code into the driver itself
fixes build problems when CONFIG_BLOCK is disabled.

Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-29 08:44:37 -07:00
Jens Axboe 7407cf355f Merge branch 'master' into for-2.6.35
Conflicts:
	fs/block_dev.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-29 09:36:24 +02:00
Dmitry Monakhov fbd9b09a17 blkdev: generalize flags for blkdev_issue_fn functions
The patch just converts all blkdev_issue_xxx functions to a common
set of flags. Wait/allocation semantics are preserved.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-28 19:47:36 +02:00
Philipp Reisner 7e2455c1a1 drbd: Terminate a connection early if sending the protocol fails
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-22 14:50:23 +02:00
Dan Carpenter 7ac314c82f drbd: fix memory leak
We leak memory if "--dry-run" is not supported by the peer.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-22 14:27:23 +02:00
Linus Torvalds 2f4084209a Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
  cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
  loop: Update mtime when writing using aops
  block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
  backing-dev: Handle class_create() failure
  Block: Fix block/elevator.c elevator_get() off-by-one error
  drbd: lc_element_by_index() never returns NULL
  cciss: unlock on error path
  cfq-iosched: Do not merge queues of BE and IDLE classes
  cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
  i2o: Remove the dangerous kobj_to_i2o_device macro
  block: remove 16 bytes of padding from struct request on 64bits
  cfq-iosched: fix a kbuild regression
  block: make CONFIG_BLK_CGROUP visible
  Remove GENHD_FL_DRIVERFS
  block: Export max number of segments and max segment size in sysfs
  block: Finalize conversion of block limits functions
  block: Fix overrun in lcm() and move it to lib
  vfs: improve writeback_inodes_wb()
  paride: fix off-by-one test
  drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
  ...
2010-04-09 11:50:29 -07:00
Nikanth Karthikesan 02246c4117 loop: Update mtime when writing using aops
Update mtime when writing to backing filesystem using the address space
operations write_begin and write_end.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-08 21:39:31 +02:00
Dan Carpenter 829f46af39 cciss: unlock on error path
We take the spin_lock again in fail_all_cmds() so we need to unlock here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Steve Cameron <scameron@beardog.cce.hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07 08:38:03 -07:00
Philipp Reisner b2b163dd47 drbd: lc_element_by_index() never returns NULL
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-02 08:40:33 +02:00
Dan Carpenter 61917bdaaf cciss: unlock on error path
We take the spin_lock again in fail_all_cmds() so we need to unlock
here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-02 08:39:40 +02:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.
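
The first bullet above (pick gfp.h if only gfp is used, slab.h if slab
is used) can be sketched roughly as follows.  This is not the actual
slabh-sweep.py; the identifier patterns and the needed_include() helper
are assumptions made up for illustration.

```python
import re

# Hypothetical detection patterns; the real script's rules are not shown in this log.
SLAB_USAGE = re.compile(r"\b(kmalloc|kzalloc|kcalloc|kfree|kmem_cache_\w+)\b")
GFP_USAGE = re.compile(r"\b(GFP_[A-Z_]+|__GFP_[A-Z_]+|gfp_t)\b")


def needed_include(source):
    """Return the one header a file should include directly, or None.

    slab.h wins over gfp.h because, per the commit message, slab.h
    itself includes gfp.h.
    """
    if SLAB_USAGE.search(source):
        return "linux/slab.h"
    if GFP_USAGE.search(source):
        return "linux/gfp.h"
    return None


if __name__ == "__main__":
    print(needed_include("p = kmalloc(sz, GFP_KERNEL);"))
    print(needed_include("gfp_t flags = GFP_ATOMIC;"))
    print(needed_include("int x;"))
```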

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Jens Axboe b4b7a4ef09 Merge branch 'master' into for-linus
Conflicts:
	block/Kconfig

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-19 08:05:10 +01:00
Martin K. Petersen ee714f2dd3 block: Finalize conversion of block limits functions
Remove compatibility wrappers and update remaining drivers.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-15 12:47:59 +01:00
Linus Torvalds c32da02342 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (56 commits)
  doc: fix typo in comment explaining rb_tree usage
  Remove fs/ntfs/ChangeLog
  doc: fix console doc typo
  doc: cpuset: Update the cpuset flag file
  Fix of spelling in arch/sparc/kernel/leon_kernel.c no longer needed
  Remove drivers/parport/ChangeLog
  Remove drivers/char/ChangeLog
  doc: typo - Table 1-2 should refer to "status", not "statm"
  tree-wide: fix typos "ass?o[sc]iac?te" -> "associate" in comments
  No need to patch AMD-provided drivers/gpu/drm/radeon/atombios.h
  devres/irq: Fix devm_irq_match comment
  Remove reference to kthread_create_on_cpu
  tree-wide: Assorted spelling fixes
  tree-wide: fix 'lenght' typo in comments and code
  drm/kms: fix spelling in error message
  doc: capitalization and other minor fixes in pnp doc
  devres: typo fix s/dev/devm/
  Remove redundant trailing semicolons from macros
  fix typo "definetly" -> "definitely" in comment
  tree-wide: s/widht/width/g typo in comments
  ...

Fix trivial conflict in Documentation/laptops/00-INDEX
2010-03-12 16:04:50 -08:00
Joe Perches 724ee626f3 drivers/block/floppy.c: remove unnecessary casting in fd_ioctl
Convert outparam to const void *.
Cast outparam to const char * for strlen().

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 0aad92cfea drivers/block/floppy.c: remove misleading, used once FD_IOCTL_ALLOWED macro
Just code the test directly

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 712e1de43e drivers/block/floppy.c: remove obfuscating CODE2SIZE macro
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches ded2863d09 drivers/block/floppy.c: add __func__ to debugt
Make debugt messages a little neater.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 7f2527174a drivers/block/floppy.c: convert raw_cmd_copyin from while(1) to label: goto
Reduces indent.
Makes a bit more readable and intelligible.
Return value now at bottom of function.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches ce2f11fe78 drivers/block/floppy.c: remove some unnecessary casting
Remove char/void __user * use.
Remove kmalloc cast.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 1ebddd85a6 drivers/block/floppy.c: use %pf in logging messages
Print the function name not the pointer address where useful and possible

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 275176bc2a drivers/block/floppy.c: use __func__ where appropriate
Add and use __func__ to is_alive.
Use __func__ in some DPRINTs.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 891eda80a5 drivers/block/floppy.c: DPRINT neatening
Move DPRINT macro definition above 1st use
Consolidate a format string (>80 columns)
Add a newline to an unterminated message
Comment neatened

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 1a23d13335 drivers/block/floppy.c: remove #define FLOPPY_SANITY_CHECK
The code could not be compiled without the #define, so just remove it and
the #ifdef/#endif lines.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 73507e6cd8 drivers/block/floppy.c: remove unnecessary argument from [__]reschedule_timeout
Prior to patch "drivers/block/floppy.c: Use pr_<level>" only
reschedule_timeout(,"request done"...) printed a numeric value after a
reschedule_timeout event message.

Restore that behavior and remove the now unnecessary argument.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 0da3132f90 drivers/block/floppy.c: unclutter redo_fd_request logic
Change for(;;) with continue; to label: goto label
Reduces indentation and adds a bit of clarity.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:31 -08:00
Joe Perches 416d8d2888 drivers/block/floppy.c: remove REPEAT macro
Macros with hidden flow changes aren't nice.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 15b2630c58 drivers/block/floppy.c: remove unnecessary return and braces
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 57584c5a38 drivers/block/floppy.c: add function is_ready_state
Used a couple of times, might simplify the code a bit.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 29f1c7848f drivers/block/floppy.c: convert int initialising to bool initialized
Don't initialize initialized either.  Default is false.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 4d18ef09df drivers/block/floppy.c: remove #define DEVICE_NAME "floppy"
Use it directly in the one place it's used.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches c529730a98 drivers/block/floppy.c: move leading && and || to preceding line
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 74f63f469e drivers/block/floppy.c: convert int 1/0 to bool true/false
Various functions use int where bool is appropriate
lock_fdc, wait_til_done, poll_drive, user_reset_fdc

Convert to bool.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 55eee80c62 drivers/block/floppy.c: remove macros CALL, WAIT and IWAIT
Obfuscating macros with embedded returns are not nice

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 86b12b48a2 drivers/block/floppy.c: remove [_]COPYIN [_]COPYOUT and ECALL macros
Remove these obfuscating macros with hidden returns

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 4575b55281 drivers/block/floppy.c: remove most uses of CALL and ECALL macros
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches e029853612 drivers/block/floppy.c: remove [U]CLEARF, [U]SETF, and [U]TESTF macros
Use clear_bit, set_bit, and test_bit functions directly

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 87f530d8f1 drivers/block/floppy.c: add debug_dcl(...) macro
Converted #ifdef DCL_DEBUG if (test) DPRINTK(...); #endif
to debug_dcl(test, ...);

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:30 -08:00
Joe Perches 52a0d61f64 drivers/block/floppy.c: remove macro LOCK_FDC
Macros with hidden returns aren't nice.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches a0a52d67de drivers/block/floppy.c: remove a few spaces from function casts
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches da27365342 drivers/block/floppy.c: remove IN/OUT macros, indent switch/case
Remove ugly IN/OUT macros, use direct case and code
Add missing semicolon after ECALL

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches 96534f1dd5 drivers/block/floppy.c: indent a comment
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches b87c9e0a88 drivers/block/floppy.c: remove CLEARSTRUCT macro, use memset
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches bb57f0c662 drivers/block/floppy.c: comment neatening and remove naked ;
Spacing, column alignment and a for loop with
a naked semicolon converted to an assign and while

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches 2300f90e31 drivers/block/floppy.c: remove LAST_OUT macro
Macros with hidden returns are not nice.
Convert the 2 uses to use direct code.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches d7b2b2ecd8 drivers/block/floppy.c: hoist assigns from if()s, neatening
Move assigns above if()s
Remove unnecessary parentheses from returns
Use a temporary for a duplicated test

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches 045f983630 drivers/block/floppy.c: remove used once CHECK_READY macro
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches a81ee54471 drivers/block/floppy.c: remove unnecessary braces
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches b46df356de drivers/block/floppy.c: use pr_<level>
Convert bare printk to pr_info and pr_cont
Convert printk(KERN_ERR to pr_err

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches 48c8cee61f drivers/block/floppy.c: #define space and column neatening
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:29 -08:00
Joe Perches d49375434e drivers/block/floppy.c: convert some #include <asm/ to #include <linux/
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:28 -08:00
Roel Kluin c12ec0a2d9 paride: fix off-by-one test
With `while (j++ < PX_SPIN)' j reaches PX_SPIN + 1 after the loop.  This
is probably unlikely to produce a problem.
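
A small model of the loop counter may make the off-by-one easier to
see.  This is only a sketch: Python has no post-increment, so spin()
imitates `while (j++ < limit)` by hand, and the limit of 5 and the
ready_at parameter are made-up values, not taken from the driver.

```python
def spin(limit, ready_at=None):
    """Imitate C's `while (j++ < limit) { if (ready) break; }`.

    Returns (final_j, succeeded).  ready_at is the value of j (as seen
    inside the loop body) at which the hypothetical device becomes ready.
    """
    j = 0
    while True:
        passed = j < limit   # the comparison uses the old value of j
        j += 1               # the increment happens even when the test fails
        if not passed:
            return j, False  # timed out: j is now limit + 1
        if ready_at is not None and j >= ready_at:
            return j, True   # body saw the device ready and broke out


if __name__ == "__main__":
    print(spin(5))              # timeout
    print(spin(5, ready_at=5))  # ready on the very last iteration
```

On timeout j ends at limit + 1, while success on the last iteration
leaves j equal to limit, so a test after the loop has to distinguish
the two values.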

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-12 10:03:42 +01:00
Lars Ellenberg 39ad2bbb59 drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
Up to now, applying the in-core activity-log to the on-disk
bitmap did not care for logical_block_size.

On logical_block_size != 512 byte, this very likely results
in misaligned block access and spurious "io errors".

We now simply always submit aligned whole 4k blocks, fixing this
for logical block sizes of 512, 1024, 2048 and 4096.

For even larger logical block sizes, this won't work.
But I'm not aware of devices with such properties being available.
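
The alignment argument can be checked with a few lines of arithmetic.
This is a sketch only; aligned_block() and is_valid_io() are
hypothetical helpers, not DRBD functions, and offsets are plain byte
offsets.

```python
ALIGNED_IO_SIZE = 4096  # "aligned whole 4k blocks", per the message above


def aligned_block(offset):
    """Return (start, length) of the 4k block containing byte offset."""
    start = offset & ~(ALIGNED_IO_SIZE - 1)  # round down to a 4k boundary
    return start, ALIGNED_IO_SIZE


def is_valid_io(start, length, logical_block_size):
    """An I/O is acceptable if start and length are block-size multiples."""
    return start % logical_block_size == 0 and length % logical_block_size == 0


if __name__ == "__main__":
    start, length = aligned_block(9 * 512)  # a 512-byte sector at offset 4608
    for lbs in (512, 1024, 2048, 4096, 8192):
        print(lbs, is_valid_io(start, length, lbs))
```

A 4k request is a whole multiple of every block size up to 4096, but
only half of an 8192-byte block, which is why larger logical block
sizes are not covered.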

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:33:46 +01:00
Philipp Reisner 1f55243024 drbd: Renamed overwrite_peer to primary_force
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:32:14 +01:00
Philipp Reisner d10a33c68b drbd: Forcing primary should also work for Consistent disks [Bugz 266]
Up to now this only worked for Outdated and Inconsistent disks; that
it did not work for Consistent disks was an inconsistent omission.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:12:35 +01:00
Philipp Reisner d0c3f60f36 drbd: Make sure we do not send state updates during an empty resync [Bugz 271]
This is a race condition that existed for ages.
The previous commit reduces the window, this one closes it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:10:40 +01:00
Philipp Reisner 309d1608cc drbd: Reduce the time an empty resync takes usually
This mitigates changes introduced with commit:
http://git.drbd.org/?p=drbd-8.3.git;a=commit;h=4b6803a3276652da3737

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:09:03 +01:00
Lars Ellenberg c42b6cf4b3 drbd: add missing drbd command names to avoid <NULL> in error messages
cmdname() should map command number to its human readable
representation. The string table was incomplete, though.

Maybe rather do a switch() block, and let the compiler help us
to keep it complete?

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:04:05 +01:00
Lars Ellenberg 4589d7f829 drbd_disconnect: grab meta.socket mutex as well
Fixes a race and potential kernel panic if e.g. the worker was just
about to send a few P_RS_IS_IN_SYNC via the meta socket for checksum
based resync, while the receiver destroys the sockets in
drbd_disconnect.

To make sure no-one is using the meta socket,
it is not enough to stop the asender...
Grab the meta socket mutex before destroying it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:02:45 +01:00
Lars Ellenberg 676396d545 fix unit of rs_same_csums accounting
Depending on resync request size,
we need to account for more than one bit.

Impact: cosmetic

If SyncTarget correctly reported 100% equal checksums,
the SyncSource usually reported 12% equal checksums instead,
because it only counted requests: we typically do 32k resync requests,
and the bitmap granularity is still 4k.
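
The 12% figure follows from the ratio of the two sizes.  A rough
sketch of the arithmetic, with made-up helper names and the assumption
that the percentage is computed with integer division:

```python
BM_BLOCK_SIZE = 4 * 1024      # one bitmap bit covers 4k
RESYNC_REQUEST = 32 * 1024    # typical resync request size


def bits_for_request(size):
    """Number of bitmap bits covered by one request (rounded up)."""
    return (size + BM_BLOCK_SIZE - 1) // BM_BLOCK_SIZE


def same_csum_percent(request_sizes, count_bits):
    """Percent of 'equal checksum' units relative to total bitmap bits.

    Every request is assumed to have matched.  count_bits=False models
    counting one unit per request; True counts one per bit.
    """
    total_bits = sum(bits_for_request(s) for s in request_sizes)
    counted = sum(bits_for_request(s) if count_bits else 1 for s in request_sizes)
    return 100 * counted // total_bits


if __name__ == "__main__":
    requests = [RESYNC_REQUEST] * 100
    print(same_csum_percent(requests, count_bits=False))
    print(same_csum_percent(requests, count_bits=True))
```

With 32k requests and 4k bits, each request covers 8 bits, so counting
requests yields 1/8 of the true value.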

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:01:38 +01:00
Lars Ellenberg 580b9767db drbd: fix broken state change after split-brain attach while connected
Situation:
we have diverging data sets, i.e. we had a split brain at some point,
but currently are connected, one node diskless.

Then we try to attach that disk, figure it is consistent,
but has a diverging data set, we refuse to attach.

This led to strange state changes:
22:18:35 bb drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> Connected) pdsk( DUnknown -> UpToDate )
22:19:30 bb drbd1: disk( Diskless -> Attaching )
22:19:30 bb drbd1: disk( Attaching -> Negotiating )
22:19:30 bb drbd1: drbd_sync_handshake:
22:19:30 bb drbd1: self 97BF25798B9D5222:F33D1F62ADE698DD:4269796F9D027C83:AC45D8B5C3C1BF93 bits:19449 flags:0
22:19:30 bb drbd1: peer 280DFB6E125465D3:F33D1F62ADE698DC:4269796F9D027C82:AC45D8B5C3C1BF93 bits:2575806 flags:0
22:19:30 bb drbd1: uuid_compare()=100 by rule 90
22:19:30 bb drbd1: Split-Brain detected, dropping connection!
22:19:30 bb drbd1: disk( Negotiating -> Diskless )

while the other side says:
22:19:30 aa drbd1: Split-Brain detected, dropping connection!
22:19:30 aa drbd1: Disk attach process on the peer node was aborted.
22:19:30 aa drbd1: conn( Connected -> TOO_LARGE ) pdsk( Diskless -> Consistent )

This should be fixed now.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 16:00:09 +01:00
Lars Ellenberg 4aa83b7bf1 drbd: fix NULL pointer dereference on 4k hard sect size
we still don't support 4k 'physical' sectors 'natively',
but use a read-modify-write workaround.
And we even tried to use the extra page before we allocated it :(

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 15:58:25 +01:00
Philipp Reisner cf14c2e987 drbd: --dry-run option for drbdsetup net ( drbdadm -- --dry-run connect <res> )
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-03-11 15:51:23 +01:00
Thomas Gleixner 8a03ae2a5b block: drbd: Convert semaphore to mutex
The bm_change semaphore is semantically a mutex. Convert it to a real
mutex.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2010-03-11 13:30:16 +01:00
Jiri Kosina 318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
Emese Revfy 52cf25d0ab Driver core: Constify struct sysfs_ops in struct kobj_type
Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker moves const data into .rodata
   and therefore excludes it from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Andi Kleen 28812fe11a driver-core: Add attribute argument to class_attribute show/store
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
a separate function for every piece of data.

Also drivers can extend the attributes with their own data fields
and use them in the low level function.

This makes the class attributes the same as sysdev_class attributes
and plain attributes.

This will allow further cleanups in drivers.

Full tree sweep converting all users.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Thomas Gleixner ced918eb74 i8253: Convert i8253_lock to raw_spinlock
i8253_lock needs to be a real spinlock in preempt-rt, i.e. it can
not be converted to a sleeping lock.

Convert it to raw_spinlock and fix up all users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
LKML-Reference: <20100217163751.030764372@linutronix.de>
2010-03-02 10:28:38 +01:00
Linus Torvalds b1bf936840 Merge branch 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block: (38 commits)
  block: don't access jiffies when initialising io_context
  cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds
  block: fix for "Consolidate phys_segment and hw_segment limits"
  cfq-iosched: quantum check tweak
  blktrace: perform cleanup after setup error
  blkdev: fix merge_bvec_fn return value checks
  cfq-iosched: requests "in flight" vs "in driver" clarification
  cciss: Fix problem with scatter gather elements in the scsi half of the driver
  cciss: eliminate unnecessary pointer use in cciss scsi code
  cciss: do not use void pointer for scsi hba data
  cciss: factor out scatter gather chain block mapping code
  cciss: fix scatter gather chain block dma direction kludge
  cciss: simplify scatter gather code
  cciss: factor out scatter gather chain block allocation and freeing
  cciss: detect bad alignment of scsi commands at build time
  cciss: clarify command list padding calculation
  cfq-iosched: rethink seeky detection for SSDs
  cfq-iosched: rework seeky detection
  block: remove padding from io_context on 64bit builds
  block: Consolidate phys_segment and hw_segment limits
  ...
2010-03-01 09:00:29 -08:00
Stephen Rothwell 91f63d0efa block: fix for "Consolidate phys_segment and hw_segment limits"
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-01 10:43:39 +01:00
Stephen M. Cameron 87c3a922a7 cciss: Fix problem with scatter gather elements in the scsi half of the driver
When support for more than 31 scatter gather elements was added to the block
half of the driver, the SCSI half of the driver was not addressed, and the bump
from 31 to 32 scatter gather elements in the command block itself (not chained)
actually broke the SCSI half of the driver, so that any transfer requiring 32
scatter gather elements wouldn't work.  This fix also increases the max transfer
size and size of the scatter gather table to the limit supported by the controller.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:32 +01:00
Stephen M. Cameron bf88737818 cciss: eliminate unnecessary pointer use in cciss scsi code
An extra level of indirection was being used in some places
for no real reason.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:32 +01:00
Stephen M. Cameron aad9fb6f2c cciss: do not use void pointer for scsi hba data
Also get rid of related unnecessary type casting
and delete some superfluous and misleading comments nearby.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:32 +01:00
Stephen M. Cameron d45033ef56 cciss: factor out scatter gather chain block mapping code
The rationale is that I want to use this code from the SCSI half of the
driver.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:32 +01:00
Stephen M. Cameron 2ad6cdc20f cciss: fix scatter gather chain block dma direction kludge
The data direction for the chained block of scatter gather
elements should always be PCI_DMA_TODEVICE, but was mistakenly
set to the direction of the data transfer, then a kludge to
fix it was added, in which pci_dma_sync_single_for_device or
pci_dma_sync_single_for_cpu was called.  If the correct direction
is used in the first place, the kludge isn't needed.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:31 +01:00
Stephen M. Cameron dccc9b563e cciss: simplify scatter gather code
Instead of allocating an array of pointers to a structure
containing an SGDescriptor structure, and two other elements
that aren't really used, just allocate SGDescriptor structs.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:31 +01:00
Stephen M. Cameron 49fc5601ea cciss: factor out scatter gather chain block allocation and freeing
Rationale is that I want to use this code from the scsi half of the
driver.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:31 +01:00
Stephen M. Cameron 1b7d0d28ad cciss: detect bad alignment of scsi commands at build time
Incidentally fix some nearby c++ style comments.
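
A sketch of a build-time alignment check, using C11 _Static_assert where
kernel code would typically use a macro such as BUILD_BUG_ON().  The struct
demo_cmd layout, the CMD_ALIGNMENT value and demo_cmd_offset are hypothetical
and are not taken from the cciss driver.

```c
#include <stddef.h>

/* Hypothetical alignment the hardware requires for each command. */
#define CMD_ALIGNMENT 8

/* Hypothetical command block; the padding keeps the size a multiple of 8. */
struct demo_cmd {
	unsigned char header[20];
	unsigned char cdb[16];
	unsigned char pad[4];
};

/* Fails the build, rather than the running system, if the layout drifts. */
_Static_assert(sizeof(struct demo_cmd) % CMD_ALIGNMENT == 0,
	       "struct demo_cmd must be a multiple of CMD_ALIGNMENT bytes");

/* In an array, element `index` then starts at an aligned offset. */
static size_t demo_cmd_offset(size_t index)
{
	return index * sizeof(struct demo_cmd);
}
```

Changing pad[4] to pad[3] makes the size 39 bytes and the compilation fails
with the message above.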

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:31 +01:00
Stephen M. Cameron 58daa9ce96 cciss: clarify command list padding calculation

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-28 19:42:31 +01:00
Linus Torvalds 847f9c606c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: (24 commits)
  m68k: Define sigcontext ABI of ColdFire
  m68knommu: NPTL support for uClinux
  m68k: Add NPTL support
  m68k: Eliminate unused variable in page_to_phys()
  m68k: Switch to generic siginfo layout
  macfb: fix 24-bit visual and stuff
  macfb: cleanup
  fbdev: add some missing mac modes
  mac68k: start CUDA early
  valkyriefb: various fixes
  fbdev: mac_var_to_mode() fix
  mac68k: move macsonic and macmace platform devices
  mac68k: move mac_esp platform device
  mac68k: replace mac68k SCC code with platform device
  pmac-zilog: add platform driver
  pmac-zilog: cleanup
  mac68k: rework SWIM platform device
  mac68k: cleanup
  ataflop: Kill warning about unused variable flags
  m68k: Use DIV_ROUND_CLOSEST
  ...
2010-02-27 16:22:47 -08:00
Finn Thain 2724daf439 mac68k: rework SWIM platform device
Adjust the platform device code to conform with the code style used in the
rest of this patch series. No need to name resources nor to register
devices which are not applicable.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2010-02-27 18:27:15 +01:00
Geert Uytterhoeven 41fb11ca90 ataflop: Kill warning about unused variable flags
After commit e0c0978699 ("ataflop: remove
buggy/commented-out IRQ disable from do_fd_request()") the `flags' variable
became unused:

drivers/block/ataflop.c:1473: warning: unused variable 'flags'

Hence remove it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2010-02-27 18:27:15 +01:00
Martin K. Petersen 8a78362c4e block: Consolidate phys_segment and hw_segment limits
Except for SCSI no device drivers distinguish between physical and
hardware segment limits.  Consolidate the two into a single segment
limit.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-26 13:58:08 +01:00
Martin K. Petersen 086fa5ff08 block: Rename blk_queue_max_sectors to blk_queue_max_hw_sectors
The block layer calling convention is blk_queue_<limit name>.
blk_queue_max_sectors predates this practice, leading to some confusion.
Rename the function to appropriately reflect that its intended use is to
set max_hw_sectors.

Also introduce a temporary wrapper for backwards compatibility.  This can
be removed after the merge window is closed.
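
The shape of such a wrapper, as a simplified sketch: struct request_queue is
reduced to a single field here, and the function body only stores the limit,
which is an assumption made for illustration rather than a copy of what the
block layer does.

```c
/* Minimal stand-in for the request queue limits (assumption). */
struct request_queue {
	unsigned int max_hw_sectors;
};

/* New name, following the blk_queue_<limit name> convention. */
static void blk_queue_max_hw_sectors(struct request_queue *q, unsigned int max)
{
	q->max_hw_sectors = max;
}

/* Temporary wrapper so callers still using the old name keep building. */
static inline void blk_queue_max_sectors(struct request_queue *q,
					 unsigned int max)
{
	blk_queue_max_hw_sectors(q, max);
}
```

Once every caller has been converted, the wrapper can be deleted without
touching the new function.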

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-26 13:58:08 +01:00
Martin K. Petersen eb28d31bc9 block: Add BLK_ prefix to definitions
Add a BLK_ prefix to block layer constants.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-26 13:58:08 +01:00
Benjamin Herrenschmidt 874f2f997d Merge commit 'origin/master' into next
Manual merge of:
	drivers/char/hvc_console.c
	drivers/char/hvc_console.h
2010-02-26 14:41:00 +11:00
Akinobu Mita c5ecc484c5 pktcdvd: use BIO list management functions
Now that the bio list management stuff is generic, convert pktcdvd to
use bio lists instead of its own private bio list implementation.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Peter Osterlund <petero2@telia.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-24 08:30:08 +01:00
Christoph Hellwig 69740c8ba8 virtio_blk: add block topology support
Allow reading various alignment values from the config page.  This
allows the guest to much better align I/O requests depending on the
storage topology.

Note that the formats for the config values appear a bit messed up,
but we follow the formats used by ATA and SCSI so they are expected in
the storage world.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 14:22:26 +10:30
Jens Axboe f11cbd74c5 Merge branch 'master' into for-2.6.34 2010-02-22 13:48:51 +01:00
dann frazier 429c42c9d2 cciss: Consolidate duplicate bits in cciss_cmd.h & cciss_ioctl.h
There are several duplicate definitions in cciss_cmd.h and cciss_ioctl.h.
Consolidate these into the new cciss_defs.h file. This patch doesn't change
the definitions exposed under include/linux, so userspace apps shouldn't
be affected.

Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-22 13:44:45 +01:00
dann frazier b028461d66 cciss: remove C99-style comments
Some cleanup before the header file split-out so we don't propagate this style
into new files.

Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-22 13:44:45 +01:00
Benjamin Herrenschmidt ec144a81ad Merge commit 'origin/master' into next 2010-02-17 10:00:42 +11:00
Daniel Mack 3ad2f3fbb9 tree-wide: Assorted spelling fixes
In particular, several occurrences of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-09 11:13:56 +01:00
Stephen M. Cameron 531c2dc70d cciss: Make cciss_seq_show handle holes in the h->drv[] array
It is possible (and expected) for there to be holes in the h->drv[]
array, that is, some elements may be NULL pointers.  cciss_seq_show
needs to be made aware of this possibility to avoid an Oops.

To reproduce the Oops which this fixes:

1) Create two "arrays" in the Array Configuration Utility and
   several logical drives on each array.
2) cat /proc/driver/cciss/cciss* in an infinite loop
3) delete some of the logical drives in the first "array."
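
A minimal sketch of a show routine that tolerates holes.  The types
demo_drive and demo_ctlr, the constant DEMO_MAX_DRIVES and the function
demo_seq_show are hypothetical stand-ins and do not reproduce the driver's
data structures.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX_DRIVES 4

/* Hypothetical per-drive record. */
struct demo_drive {
	int id;
};

/* Hypothetical controller; NULL entries in drv[] are holes. */
struct demo_ctlr {
	struct demo_drive *drv[DEMO_MAX_DRIVES];
};

/* Format drive n, or emit nothing if slot n is out of range or a hole. */
static int demo_seq_show(const struct demo_ctlr *h, size_t n,
			 char *buf, size_t len)
{
	if (n >= DEMO_MAX_DRIVES || h->drv[n] == NULL)
		return 0;

	return snprintf(buf, len, "drive%d\n", h->drv[n]->id);
}
```

Without the NULL check, reading slot n after its drive was deleted would
dereference a NULL pointer, which is the Oops described above.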

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-05 13:15:36 +01:00
Geert Uytterhoeven c5c7b32d3c ataflop: Kill warning about unused variable flags
After commit e0c0978699 ("ataflop: remove
buggy/commented-out IRQ disable from do_fd_request()") the `flags' variable
became unused:

drivers/block/ataflop.c:1473: warning: unused variable 'flags'

Hence remove it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-04 11:55:44 +01:00
Joe Perches dc942cee2f powerpc/viodasd: Remove VIOD_KERN_<level> macros for printks
Use #define pr_fmt(fmt) "viod: " fmt
Remove #define VIOD_KERN_WARNING and VIOD_KERN_INFO
Convert printk(VIOD_KERN_<level> to pr_<level>
Coalesce long format strings
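
How the pr_fmt() prefix composes with a format string, as a userspace
sketch.  pr_to_buf and demo_warn are made-up names that format into a buffer
instead of the kernel log, and the ##__VA_ARGS__ form is a GNU C extension
(accepted by gcc and clang).

```c
#include <stddef.h>
#include <stdio.h>

/* Prefix applied to every message from this file, as in the commit. */
#define pr_fmt(fmt) "viod: " fmt

/* Userspace stand-in for pr_<level>(): string literals are concatenated. */
#define pr_to_buf(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ##__VA_ARGS__)

/* Hypothetical caller; the prefix no longer appears at the call site. */
static int demo_warn(char *buf, size_t len, int unit)
{
	return pr_to_buf(buf, len, "disk %d not ready\n", unit);
}
```

The prefix is defined once per file, so individual messages no longer need
to carry a VIOD_KERN_<level> macro.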

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-03 17:39:48 +11:00
Thadeu Lima de Souza Cascardo ca0bf64d99 pktcdvd: removing device does not remove its sysfs dir
This is the counterpart to cba767175b
("pktcdvd: remove broken dev_t export of class devices").  Device is not
registered using dev_t, so it should not be destroyed using device_destroy
which looks up the device by dev_t.  This will fail and adding the device
again will fail with the "duplicate name" error.  This is fixed using
device_unregister instead of device_destroy.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-02 18:11:23 -08:00
Dan Carpenter d3db7b485a drbd: null dereference bug
epoch is always NULL here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2010-01-25 18:01:41 +01:00
Lars Ellenberg 98ec286e01 drbd: fix max_segment_size initialization
blk_queue_make_request() internally calls blk_set_default_limits(),
so calling blk_queue_max_segment_size() before is useless.
Ergo: move the call to blk_queue_max_segment_size() down a few lines.

Impact:
If, after a fresh modprobe, you first connect a Diskless drbd,
then attach, this could result in a DRBD Protocol Error at first.
The next connection attempt would then succeed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-01-22 11:34:54 +01:00
Philipp Reisner a393db6f10 drbd: Allow online resizing of DRBD devices while peer not reachable (needs to be explicitly forced)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-01-12 10:02:46 +01:00
Johannes Thoma b10d96cb9c drbd: Don't go into StandAlone mode when authentication fails because of network error
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-01-12 09:38:27 +01:00
Márton Németh 47483e2520 block: make virtio device id constant
The id_table field of the struct virtio_driver is constant in <linux/virtio.h>
so it is worth making id_table constant as well.

The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
	struct I1 {
	  ...
	  const struct I2 *x;
	  ...
	};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
	struct I1 y = {
	  .x = E,
	};
@c@
identifier r.I2;
identifier s.E;
@@
	const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+	const
	struct I2 E[] = ...;
// </smpl>

Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-01-11 14:31:27 +01:00
Márton Németh ec9c42ec79 block: make xenbus device id constant
The ids field of the struct xenbus_device_id is constant in <linux/xen/xenbus.h>
so it is worth making blkfront_ids constant as well.

The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
	struct I1 {
	  ...
	  const struct I2 *x;
	  ...
	};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
	struct I1 y = {
	  .x = E,
	};
@c@
identifier r.I2;
identifier s.E;
@@
	const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+	const
	struct I2 E[] = ...;
// </smpl>

Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-01-11 14:31:27 +01:00
Márton Németh 5cccfd9b3a block: make Open Firmware device id constant
The match_table field of the struct of_device_id is constant in <linux/of_platform.h>
so it is worth making ace_of_match constant as well.

The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
	struct I1 {
	  ...
	  const struct I2 *x;
	  ...
	};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
	struct I1 y = {
	  .x = E,
	};
@c@
identifier r.I2;
identifier s.E;
@@
	const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+	const
	struct I2 E[] = ...;
// </smpl>

Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-01-11 14:31:27 +01:00
Márton Németh 577cdf0cf5 block: make USB device id constant
The id_table field of the struct usb_device_id is constant in <linux/usb.h>
so it is worth making ub_usb_ids constant as well.

The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
	struct I1 {
	  ...
	  const struct I2 *x;
	  ...
	};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
	struct I1 y = {
	  .x = E,
	};
@c@
identifier r.I2;
identifier s.E;
@@
	const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+	const
	struct I2 E[] = ...;
// </smpl>

Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-01-11 14:31:26 +01:00
Márton Németh 3d447ec0e3 block: make PCI device id constant
The id_table field of the struct pci_driver is constant in <linux/pci.h>
so it is worth making the initialization data constant as well.

The semantic match that finds this kind of pattern is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
disable decl_init,const_decl_init;
identifier I1, I2, x;
@@
	struct I1 {
	  ...
	  const struct I2 *x;
	  ...
	};
@s@
identifier r.I1, y;
identifier r.x, E;
@@
	struct I1 y = {
	  .x = E,
	};
@c@
identifier r.I2;
identifier s.E;
@@
	const struct I2 E[] = ... ;
@depends on !c@
identifier r.I2;
identifier s.E;
@@
+	const
	struct I2 E[] = ...;
// </smpl>

Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: cocci@diku.dk
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-01-11 14:31:26 +01:00
Lars Ellenberg 36bfc7e210 drbd: check on CONFIG_LBDAF, not LBD
It is called LBDAF since 2.6.31.

impact:
without this change, on 32bit,
DRBD would wrongly claim to only support 2TiB devices.
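
The 2TiB figure follows from a 32-bit sector index and 512-byte sectors:
2^32 * 512 = 2^41 bytes.  A minimal sketch of that arithmetic, assuming the
sector index is only 32 bits wide when large block device support is off on
a 32-bit build; max_device_bytes is a made-up helper name.

```c
#include <stdint.h>

#define SECTOR_SIZE 512ULL	/* bytes per sector, the block layer's unit */

/*
 * Largest device size in bytes addressable with a sector index that is
 * `bits` wide.  Only meaningful for bits < 55, otherwise the product
 * no longer fits in 64 bits.
 */
static uint64_t max_device_bytes(unsigned int bits)
{
	return (1ULL << bits) * SECTOR_SIZE;
}
```

max_device_bytes(32) is 2^41 bytes, i.e. 2TiB, which is the limit a check on
the wrong config symbol would report.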

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2010-01-07 14:07:11 +01:00
Julia Lawall 2d1ee87d87 drivers/block/drbd: Correct NULL test
Test the just-allocated value for NULL rather than some other value.
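
The bug class in plain C, as a userspace sketch with malloc() standing in
for the kernel allocators.  demo_pair, demo_pair_init and demo_pair_free are
hypothetical names; the code is not taken from the drbd sources.

```c
#include <stdlib.h>

struct demo_pair {
	int *a;
	int *b;
};

/* Allocate both members; each check tests the pointer just allocated. */
static int demo_pair_init(struct demo_pair *p)
{
	p->a = malloc(sizeof(*p->a));
	if (p->a == NULL)
		return -1;

	p->b = malloc(sizeof(*p->b));
	if (p->b == NULL) {	/* the buggy pattern would test p->a here */
		free(p->a);
		p->a = NULL;
		return -1;
	}
	return 0;
}

/* Release both members; free(NULL) is a no-op. */
static void demo_pair_free(struct demo_pair *p)
{
	free(p->b);
	free(p->a);
}
```

Testing p->a a second time would always pass at that point, so a failed
allocation of p->b would go unnoticed.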

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,y;
statement S;
@@

x = \(kmalloc\|kcalloc\|kzalloc\)(...);
(
if ((x) == NULL) S
|
if (
-   y
+   x
       == NULL)
 S
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2010-01-04 11:51:41 +01:00
Philipp Reisner 367a8d7385 drbd: Silenced an assert that could be triggered after changing write ordering method
Immediately after changing the write ordering method, the epoch can already
be finished at this point.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-12-31 09:33:09 +01:00
Johannes Thoma 89f01d5cd3 drbd: Kconfig fix
!CONFIG_OPT evaluates to FALSE if CONFIG_OPT='m'. Do not display the
"DRBD disabled..." message if the dependencies are compiled as module.

Signed-off-by: Johannes Thoma <johannes.thoma@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2009-12-29 17:38:28 +01:00
Philipp Reisner 0a6dbf2bc4 drbd: Fix for a race between IO and a detach operation [Bugz 262]
In D_DISKLESS we do not hand out any new references to ldev (local_cnt),
therefore waiting until all previously handed out references have been
returned is sufficient before actually freeing mdev->ldev.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-12-29 17:36:40 +01:00
Philipp Reisner 0798219f61 drbd: Use drbd_crypto_is_hash() instead of an open coded check
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-12-29 17:35:27 +01:00
Andrew Morton 6ec1480d85 aoe: switch to the new bio_flush_dcache_pages() interface
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ilya Loginov <isloginov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Horton <phorton@bitbox.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-22 09:12:48 +01:00
H Hartley Sweeten e019ef0c4f drivers/block/mg_disk.c: use resource_size()
Use resource_size() for ioremap.

The ioremap appears to be passing the incorrect size for the platform
resource.  Unfortunately, I can't locate a user in mainline to verify
this.  Using resource_size should be the correct fix.
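
For reference, resource_size() computes end - start + 1 because both bounds
of a resource are inclusive.  The sketch below mirrors that calculation with
a simplified, hypothetical struct demo_resource; which wrong size the driver
passed before this change is not stated here.

```c
#include <stdint.h>

/* Simplified stand-in for struct resource: both bounds are inclusive. */
struct demo_resource {
	uint64_t start;
	uint64_t end;
};

/* Same arithmetic as the kernel's resource_size(): end - start + 1. */
static uint64_t demo_resource_size(const struct demo_resource *res)
{
	return res->end - res->start + 1;
}
```

A region from 0x1000 to 0x1fff therefore has size 0x1000; the plain
difference end - start would be one byte short.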

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: unsik Kim <donari75@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-22 09:12:48 +01:00
Julia Lawall df9dc83d19 drivers/block/DAC960.c: use DAC960_V2_Controller
DAC960_LP_Controller and DAC960_V2_Controller have the same value, but
elsewhere it is DAC960_V1_Controller or DAC960_V2_Controller that is used
in the FirmwareType field.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-22 09:12:48 +01:00
Jens Axboe 490c560b10 Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-linus 2009-12-21 19:16:38 +01:00
Huang Weiyi 820cd61a28 drbd: remove unused #include <linux/version.h>
Remove unused #include <linux/version.h>('s) in
  drivers/block/drbd/drbd_main.c
  drivers/block/drbd/drbd_receiver.c
  drivers/block/drbd/drbd_worker.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2009-12-21 13:41:16 +01:00
Huang Weiyi 7b886f4f7a drbd: remove duplicated #include
Remove duplicated #include('s) in
  drivers/block/drbd/drbd_worker.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2009-12-21 13:41:11 +01:00
Roel Kluin 49829ea74f drbd: Fix test of unsigned in _drbd_fault_random()
rsp->count is unsigned so the test does not work.
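
Why a "less than zero" test on an unsigned counter can never fire, shown as
a small sketch.  The exact expression in _drbd_fault_random() is not quoted
here, so demo_dec and demo_take are hypothetical illustrations of the bug
class and of one working alternative.

```c
#include <limits.h>

/* Decrementing an unsigned 0 wraps around; the result is never negative. */
static unsigned int demo_dec(unsigned int count)
{
	return count - 1;
}

/* Working form: compare against zero before decrementing. */
static int demo_take(unsigned int *count)
{
	if (*count == 0)
		return 0;	/* exhausted, the caller should refill */
	(*count)--;
	return 1;
}
```

With an unsigned counter, the exhausted case has to be detected by comparing
with zero, not by looking for a negative value.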

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2009-12-21 13:37:29 +01:00
Emese Revfy 7d4e9d0962 drbd: Constify struct file_operations
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2009-12-21 12:45:15 +01:00
Roel Kluin 4a63b030d7 drbd: fix test of unsigned in _drbd_fault_random()
rsp->count is unsigned so the test does not work.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-18 12:38:11 +01:00
Linus Torvalds 51b736b851 Merge branch 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block:
  cfq: set workload as expired if it doesn't have any slice left
  Fix a CFQ crash in "for-2.6.33" branch of block tree
  cfq: Remove wait_request flag when idle time is being deleted
  cfq-iosched: commenting non-obvious initialization
  cfq-iosched: Take care of corner cases of group losing share due to deletion
  cfq-iosched: Get rid of cfqq wait_busy_done flag
  cfq: Optimization for close cooperating queue searching
  block,xd: Delay allocation of DMA buffers until device is known
  drbd: Following the hmac change to SHASH (see linux commit 8bd1209cff)
  cfq-iosched: reduce write depth only if sync was delayed
2009-12-15 09:11:28 -08:00
Arjan van de Ven 2886a8bdfa floppy: Add an extra bound check on ioctl arguments
gcc is not convinced that the floppy.c ioctl has sufficient bound checks:

In function `copy_from_user',
    inlined from `fd_copyin' at drivers/block/floppy.c:3080,
    inlined from `fd_ioctl' at drivers/block/floppy.c:3503:
    arch/x86/include/asm/uaccess_32.h:211:
warning: call to `copy_from_user_overflow' declared with attribute
warning: copy_from_user buffer size is not provably correct

And frankly, as a human I have a hard time proving the same more or less
(the size comes from the ioctl argument.  humpf.  maybe.  the code isn't
very nice)

This patch adds an explicit check to make 100% sure it's safe, better than
finding out later that there indeed was a gap.
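
The kind of explicit check meant here, as a userspace sketch: memcpy()
stands in for copy_from_user(), and demo_copyin is a hypothetical helper,
not the floppy driver's fd_copyin().

```c
#include <stddef.h>
#include <string.h>

/* Copy `size` bytes into a fixed buffer only if it provably fits. */
static int demo_copyin(void *dst, size_t dst_size,
		       const void *src, size_t size)
{
	if (size > dst_size)
		return -1;	/* reject instead of overflowing dst */

	memcpy(dst, src, size);
	return 0;
}
```

With the comparison placed directly in front of the copy, neither the
compiler nor a human reader has to trace the size back to the ioctl argument.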

[akpm@linux-foundation.org: add WARN_ON()]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:25 -08:00
Alexey Dobriyan 471452104b const: constify remaining dev_pm_ops
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:25 -08:00
Linus Torvalds 09cea96caa Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (151 commits)
  powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
  MAINTAINERS: Add PowerPC patterns
  powerpc/pseries: Track previous CPPR values to correctly EOI interrupts
  powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP
  powerpc: Make "intspec" pointers in irq_host->xlate() const
  powerpc/8xx: DTLB Miss cleanup
  powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
  powerpc/8xx: Start using dcbX instructions in various copy routines
  powerpc/8xx: Restore _PAGE_WRITETHRU
  powerpc/8xx: Add missing Guarded setting in DTLB Error.
  powerpc/8xx: Fixup DAR from buggy dcbX instructions.
  powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
  powerpc/8xx: Invalidate non present TLBs
  powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
  pseries/pseries: Add code to online/offline CPUs of a DLPAR node
  powerpc: stop_this_cpu: remove the cpu from the online map.
  powerpc/pseries: Add kernel based CPU DLPAR handling
  sysfs/cpu: Add probe/release files
  powerpc/pseries: Kernel DLPAR Infrastructure
  ...
2009-12-12 14:27:24 -08:00
Linus Torvalds 11bd04f6f3 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (109 commits)
  PCI: fix coding style issue in pci_save_state()
  PCI: add pci_request_acs
  PCI: fix BUG_ON triggered by logical PCIe root port removal
  PCI: remove ifdefed pci_cleanup_aer_correct_error_status
  PCI: unconditionally clear AER uncorr status register during cleanup
  x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource
  PCI: portdrv: remove redundant definitions
  PCI: portdrv: remove unnecessary struct pcie_port_data
  PCI: portdrv: minor cleanup for pcie_port_device_register
  PCI: portdrv: add missing irq cleanup
  PCI: portdrv: enable device before irq initialization
  PCI: portdrv: cleanup service irqs initialization
  PCI: portdrv: check capabilities first
  PCI: portdrv: move PME capability check
  PCI: portdrv: remove redundant pcie type calculation
  PCI: portdrv: cleanup pcie_device registration
  PCI: portdrv: remove redundant pcie_port_device_probe
  PCI: Always set prefetchable base/limit upper32 registers
  PCI: read-modify-write the pcie device control register when initiating pcie flr
  PCI: show dma_mask bits in /sys
  ...

Fixed up conflicts in:
	arch/x86/kernel/amd_iommu_init.c
	drivers/pci/dmar.c
	drivers/pci/hotplug/acpiphp_glue.c
2009-12-11 12:18:16 -08:00
Linus Torvalds 4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Mel Gorman a3b8d92d25 block,xd: Delay allocation of DMA buffers until device is known
Loading the XD module triggers a warning like

 WARNING: at mm/page_alloc.c:1805
 __alloc_pages_nodemask+0x127/0x48f()
 Hardware name: System Product Name
 Modules linked in:
 Pid: 1, comm: swapper Not tainted 2.6.32-rc8-git5 #1
 Call Trace:
  [<c103d94b>] warn_slowpath_common+0x65/0x95
  [<c103d98d>] warn_slowpath_null+0x12/0x15
  [<c109550c>] __alloc_pages_nodemask+0x127/0x48f
  [<c10be964>] ? get_slab+0x8/0x50
  [<c10b8979>] alloc_page_interleave+0x2e/0x6e
  [<c10b8a10>] alloc_pages_current+0x57/0x99
  [<c2083a4a>] ? xd_init+0x0/0x482
  [<c1094c38>] __get_free_pages+0xd/0x1e
  [<c2083a94>] xd_init+0x4a/0x482
  [<c2082df0>] ? loop_init+0x104/0x16a
  [<c169162d>] ? loop_probe+0x0/0xaf
  [<c2083a4a>] ? xd_init+0x0/0x482
  [<c1001143>] do_one_initcall+0x51/0x13f
  [<c204a307>] kernel_init+0x10b/0x15f
  [<c204a1fc>] ? kernel_init+0x0/0x15f
  [<c1004347>] kernel_thread_helper+0x7/0x10
 ---[ end trace 686db6333ade6e7a ]---
 xd: Out of memory.

The warning is triggered because alloc_pages is called with an
order >= MAX_ORDER. The simplistic reason is that get_order() returns garbage
values when given 0 as a size. The more complex reason is that the XD driver
initialisation is broken.

It's not clear why this ever worked. XD allocates a buffer for DMA based
on the value of xd_maxsectors. This value is determined by the exact
type of controller in use but the value is determined *after* an attempt
has been made to allocate the buffer. i.e. the requested size of the DMA
buffer will always be 0.

This patch alters how XD is initialised slightly by allocating the
buffer when and if a device has actually been detected. The error paths
are updated to suit the new logic.
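
The get_order() behaviour described above can be reproduced in user space.
What follows is a minimal sketch, not the kernel's code: model_get_order()
only mimics the shape of a get_order()-style helper, MODEL_PAGE_SHIFT and
MODEL_MAX_ORDER are assumed values picked for illustration, and
model_dma_order() is a hypothetical guard in the spirit of "size the buffer
only once the sector count is known".

```c
#include <assert.h>

/* Assumed values for illustration; the real ones depend on arch and config. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_MAX_ORDER  11

/*
 * User-space model of a get_order()-style helper: the smallest order such
 * that (1 << order) pages can hold `size` bytes. For size == 0 the
 * subtraction wraps around to ULONG_MAX and the result is meaningless.
 */
int model_get_order(unsigned long size)
{
	int order = -1;

	size = (size - 1) >> (MODEL_PAGE_SHIFT - 1);
	do {
		size >>= 1;
		order++;
	} while (size);
	return order;
}

/*
 * Hypothetical guard: compute the buffer order only after the sector count
 * is known to be non-zero (512-byte sectors assumed).
 */
int model_dma_order(unsigned int maxsectors)
{
	assert(maxsectors > 0);	/* zero means "controller not detected yet" */
	return model_get_order((unsigned long)maxsectors * 512);
}
```

With these assumed constants, a zero size yields an order far above
MODEL_MAX_ORDER, which is the kind of request the allocator warns about.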

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-09 15:11:03 +01:00
Philipp Reisner 8b43aebdaa drbd: Following the hmac change to SHASH (see linux commit 8bd1209cff)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-12-09 15:11:03 +01:00
Benjamin Herrenschmidt bcd6acd51f Merge commit 'origin/master' into next
Conflicts:
	include/linux/kvm.h
2009-12-09 17:14:38 +11:00
Benjamin Herrenschmidt d58b0c39e3 powerpc/macio: Rework hotplug media bay support
The hotplug mediabay has tendrils deep into drivers/ide code
which makes a libata port rather difficult. In addition it's
ugly and could be done better.

This reworks the interface between the mediabay and the rest
of the world so that:

   - Any macio_driver can now have a mediabay_event callback
which will be called when that driver sits on a mediabay and
it's been either plugged or unplugged. The device type is
passed as an argument. We can now move all the IDE cruft
into the IDE driver itself

   - A check_media_bay() function can be used to take a peek
at the type of device currently in the bay if any, a cleaner
variant of the previous function with the same name.

   - A pair of lock/unlock functions are exposed to allow the
IDE driver to block the hotplug callbacks during the initial
setup and probing of the bay in order to avoid nasty race
conditions.

   - The mediabay code no longer needs to spin on the status
register of the IDE interface when it detects an IDE device,
this is done just fine by the IDE code itself

Overall, less code, simpler, and allows for another driver
than our old drivers/ide based one.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:09:14 +11:00
Jiri Kosina d014d04386 Merge branch 'for-next' into for-linus
Conflicts:

	kernel/irq/chip.c
2009-12-07 18:36:35 +01:00
Adam Buchbinder 6070d81eb5 tree-wide: fix misspelling of "definition" in comments
"Definition" is misspelled "defintion" in several comments; this
patch fixes them. No code changes.

Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 23:41:47 +01:00
Philipp Reisner 753c89130c drbd_req.c: use part_[inc|dec]_in_flight()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-12-03 17:40:51 +01:00
Jens Axboe 220d0b1dbf Merge branch 'master' into for-2.6.33 2009-12-03 13:49:39 +01:00
Peter Horton 0a1f127a05 aoe: prevent cache aliases
Prevent the AoE block driver from creating cache aliases of page cache
pages on machines with virtually indexed caches.

Building kernels on an AT91SAM9G20 board without this patch fails with
segmentation faults after a couple of passes.

Signed-off-by: Peter Horton <zero@colonel-panic.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-01 16:32:20 -08:00
Philipp Reisner d8c2a36b77 Fixed a regression in resync decission code drbd_uuid_compare() [Bugz 260]
Since 8.3.3 we fail to do the resync when a partial resync is not
possible, but a full sync is necessary.

This regression was introduced with 7101539930c0a89146959e7a39c09ad9c3516434

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-24 18:13:28 +01:00
Lars Ellenberg 0b33a9164a add missing state change on corrupt packet header in drbd_recv_header
Otherwise the 'state fixup' in the receiver will change to Unconnected,
but the receiver will terminate itself, and any attempt at 'down'ing
that drbd later will block forever.

see also Bugz. #259

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-24 18:12:13 +01:00
Lars Ellenberg 6c6c7951be fix in-kernel configuration serialization
this is uncritical, as we still also serialize in userland,
but to correctly serialize on the CONFIG_PENDING bit,
it must be wait_event(state_wait, !test_and_set_bit)
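
Why the condition has to be the negated test-and-set can be sketched in
user space with C11 atomics. This is an illustrative model only: the
model_* names are hypothetical, bit 0 merely plays the role of the
CONFIG_PENDING bit, and the wait queue itself is not modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit number standing in for CONFIG_PENDING. */
#define MODEL_CONFIG_PENDING 0

/* Atomically set the bit and report whether it was already set. */
bool model_test_and_set_bit(int nr, atomic_ulong *flags)
{
	unsigned long mask;

	assert(nr >= 0 && nr < (int)(sizeof(unsigned long) * CHAR_BIT));
	mask = 1UL << nr;
	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/* Atomically clear the bit, releasing ownership. */
void model_clear_bit(int nr, atomic_ulong *flags)
{
	assert(nr >= 0 && nr < (int)(sizeof(unsigned long) * CHAR_BIT));
	atomic_fetch_and(flags, ~(1UL << nr));
}

/*
 * One evaluation of the wait condition. True means the caller now owns the
 * bit and may proceed; false means somebody else owns it and the caller
 * would keep waiting.
 */
bool model_try_enter_config(atomic_ulong *flags)
{
	return !model_test_and_set_bit(MODEL_CONFIG_PENDING, flags);
}
```

Because checking and claiming happen in one atomic step, two callers cannot
both see the bit clear and both proceed.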

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-24 18:11:05 +01:00
Alex Chiang 32a87c0114 cciss: change Cmd_sg_list.sg_chain_dma type to dma_addr_t
A recent commit broke the ia64 build:

	Author: Don Brace <brace@beardog.cce.hp.com>
	Date:   Thu Nov 12 12:50:01 2009 -0600

	cciss: Add enhanced scatter-gather support.

because of this hunk:

	--- a/drivers/block/cciss.h
	+++ b/drivers/block/cciss.h
	+struct Cmd_sg_list {
	+       SGDescriptor_struct     *sgchain;
	+       dma64_addr_t            sg_chain_dma;
	+       int                     chain_block_size;
	+};

The issue is that dma64_addr_t isn't #define'd on ia64.

The way that we're using Cmd_sg_list.sg_chain_dma is to hold an
address returned from pci_map_single().

	+               temp64.val = pci_map_single(h->pdev,
	+                                 h->cmd_sg_list[c->cmdindex]->sgchain,
	+                                 len, dir);
	+
	+               h->cmd_sg_list[c->cmdindex]->sg_chain_dma = temp64.val;

pci_map_single() returns a dma_addr_t too.

This code will still work even on a 32-bit x86 build, where
dma_addr_t is defined to be a u32 because it will simply be
promoted to the __u64 that temp64.val is defined as.

Thus, declaring Cmd_sg_list.sg_chain_dma as dma_addr_t is safe.
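
The widening argument can be checked with a small user-space sketch. The
types here are stand-ins and not the kernel's definitions:
model_dma_addr_t assumes the 32-bit case, and a plain uint64_t plays the
role of temp64.val.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 32-bit DMA address type, as on a 32-bit x86 build. */
typedef uint32_t model_dma_addr_t;

/*
 * Mirror the flow in the hunk: the mapped address is first stored in a
 * 64-bit temporary, then copied into a field of the DMA address type.
 */
model_dma_addr_t model_round_trip(model_dma_addr_t mapped)
{
	uint64_t temp64_val = mapped;	/* widened: upper 32 bits are zero */
	model_dma_addr_t sg_chain_dma = (model_dma_addr_t)temp64_val;

	assert(sg_chain_dma == mapped);	/* nothing is lost on the way back */
	return sg_chain_dma;
}
```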

Cc: Don Brace <brace@beardog.cce.hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-23 09:35:06 +01:00
Stephen M. Cameron d61c42690c cciss: fix scatter gather cleanup problems
On driver unload, only free up the extra scatter gather data if they were
allocated in the first place (the controller supports it) and don't forget
to free up the sg_cmd_list array of pointers.

Signed-off-by: Don Brace <brace@beardog.cce.hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-23 09:31:48 +01:00
Alex Chiang 69ac748222 cciss: make device attrs static
No need to export those device attributes.

In fact, without this patch, we can trip over a build error if cciss
is a built-in and another driver also declares and exports attributes
with the same name.

You'll see errors like:

	drivers/scsi/built-in.o: multiple definition of `dev_attr_lunid'
	drivers/block/built-in.o: first defined here

Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Cc: <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:47:53 +01:00
Stephen M. Cameron 8721c81f64 cciss: Fix weird usage of ENXIO in cciss_scsi.c
cciss: Fix weird usage of ENXIO in cciss_scsi.c

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:54 +01:00
Don Brace 5c07a311a8 cciss: Add enhanced scatter-gather support.
cciss: Add enhanced scatter-gather support.  For controllers which
support it, more than 512 scatter-gather elements per command may
be used, and the max transfer size can be increased to 8192 blocks.

Signed-off-by: Don Brace <brace@beardog.cce.hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:54 +01:00
Stephen M. Cameron da0021841c cciss: Do not automatically rescan on UNIT ATTENTION/LUN DATA CHANGED
cciss: Do not automatically rescan on UNIT ATTENTION/LUN DATA CHANGED
There are problems with doing this.  If, say, several logical drives
are deleted at once, several such UNIT ATTENTIONS will be encountered,
often during the rescan triggered by the first such UNIT ATTENTION.
The block layer may be in the midst of trying to add logical drives
which were just deleted (resulting in the subsequent UNIT ATTENTION(s).)
Making the rescan code robust enough to tolerate this kind of thing
is too complicated for the moment.  So, for now, we just don't do it.
Note: This UNIT ATTENTION/LUN DATA CHANGED situation only occurs on
the MSA2012.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Stephen M. Cameron d06dfbd236 cciss: Remove unnecessary check in scan_thread
cciss: Remove unnecessary check in scan_thread

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Stephen M. Cameron b0e15f6db1 cciss: fix typo that causes scsi status to be lost.
cciss: fix typo that causes scsi status to be lost.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Stephen M. Cameron aa43f11147 cciss: remove sendcmd() as it is no longer used.
cciss: remove sendcmd() as it is no longer used.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Stephen M. Cameron 29009a036f cciss: clean up code in cciss_shutdown
cciss: clean up code in cciss_shutdown.  Send the flush cache
command down with interrupts still enabled, and do not do DMA
from the stack.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Stephen M. Cameron 7b838bde92 cciss: Remove the "withirq" parameter from various functions where possible
cciss:  Remove the "withirq" parameter from various functions where possible

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Stephen M. Cameron c08fac6500 cciss: Retry driver initiated cmds with unit attention condition
cciss:  Retry driver initiated cmds with unit attention condition

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Stephen M. Cameron fd8489cff4 cciss: Fix problem with remove_from_scan_list on driver unload
cciss: Fix problem with remove_from_scan_list: on driver unload
it doesn't remove the controller from the scan list correctly if
the controller is currently being scanned for new devices.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:53 +01:00
Alex Chiang 8ba95c69fe cciss: Make device attributes static
cciss: Make device attributes static

Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-13 08:45:52 +01:00
Jiri Kosina e0c0978699 ataflop: remove buggy/commented-out IRQ disable from do_fd_request()
There is a nice gem in drivers/block/ataflop.c::do_fd_request()

      void do_fd_request(struct request_queue * q)
      {
              unsigned long flags;

              DPRINT(("do_fd_request for pid %d\n",current->pid));
              while( fdc_busy ) sleep_on( &fdc_wait );
              fdc_busy = 1;
              stdma_lock(floppy_irq, NULL);

              atari_disable_irq( IRQ_MFP_FDC );
              local_save_flags(flags);        /* The request function is called with ints
              local_irq_disable();             * disabled... so must save the IPL for later */
              redo_fd_request();
              local_irq_restore(flags);
              atari_enable_irq( IRQ_MFP_FDC );
      }

If you look at the code long enough, you will notice that the
local_irq_disable() call is actually commented out. This has been
introduced back in 2002 in [1], but as you can see, the same bug has been
there even before, with the sti() call being commented out in the very
same way :)

I am not familiar with the code myself at all, but I guess that the whole
stuff can just be removed. Why do we need save_flags/restore_flags at all,
without actually disabling the local IRQs afterwards? The
redo_fd_request() doesn't seem to do anything that would mess with flags
inconsistently.
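
How a trailing two-line comment swallows a statement can be demonstrated in
isolation. The sketch below is a user-space toy, not the driver: the model_*
functions are hypothetical stand-ins that only count how often they run.

```c
#include <assert.h>

int model_irq_disable_calls;	/* how often the "disable" stand-in ran */
int model_redo_calls;		/* how often the "redo" stand-in ran */

void model_local_irq_disable(void) { model_irq_disable_calls++; }
void model_redo_fd_request(void)   { model_redo_calls++; }

void model_do_fd_request(void)
{
	int flags = 0;			/* The request function is called with ints
	model_local_irq_disable();	 * disabled... so must save the IPL for later */
	model_redo_fd_request();
	(void)flags;			/* saved but never used for anything */
}
```

After one call to model_do_fd_request() the disable counter is still zero:
the comment opened on the first line only closes at the end of the second,
so the call in between is never compiled.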

[1] http://lkml.org/lkml/2002/12/27/58

Jens:
That does look odd. The comment is correct that the function is entered
with interrupts disabled (and the queue lock held). So I'd say your
patch looks fine, the whole save/restore business looks meaningless.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Michael Schmitz <schmitz@biophys.uni-duesseldorf.de>
2009-11-09 09:40:57 +01:00
Jens Axboe 622d32d3ec Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-2.6.33 2009-11-04 18:38:23 +01:00
Jeremy Fitzhardinge 1ccbf5344c xen: move Xen-testing predicates to common header
Move xen_domain and related tests out of asm-x86 to xen/xen.h so they
can be included whenever they are necessary.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-11-04 08:47:24 -08:00
Lars Ellenberg 83c38830b0 drbd: performance - don't lose unplug events
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-04 15:21:04 +01:00
Philipp Reisner e656ec8ae2 Do not deadlock in drbd_disconnect() [bugz 258]
When there are many blocks on the fly (ua), and the AL gets into "starving"
mode (random IO, scattered all over the device), and the connection gets
interrupted, the receiver thread deadlocks in the drbd_disconnect() code path.

Affected are only nodes in Primary role.

The bug triggers most likely on systems that mirror over "long distances"

Regression introduced shortly before 8.3.3
with git commit 31e0f1250f174ac1ee317f360943a0159e19edc8

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-04 15:21:03 +01:00
Philipp Reisner 0a49216625 drbdsetup X resume-io should be usable to resume IO [Bugz 256]
When IO gets frozen due to a broken fence-peer script, the user
should be able to thaw IO by the resume-io command.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-04 15:21:01 +01:00
Lars Ellenberg 1352994b36 drbd: fix check for too large lower level device
To check whether we are truncating a very large device due to limited
meta data space, we need to check the ll_dev size.

Also improve the printk to suggest "flexible" or "internal".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-04 15:21:00 +01:00
Lars Ellenberg ad19bf6e54 fix grammar in printk
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-11-04 15:20:59 +01:00
Hideyuki Sasaki f21121cde6 block/ps3: fix slow VRAM IO
The current PS3 VRAM driver uses msleep() to wait for completion of RSX
DMA transfers between system memory and VRAM.  Depending on the system
timing, the processing delay and overhead of this msleep() call can
significantly impact VRAM driver IO.

To avoid the condition, add a short duration (200 usec max) udelay()
polling loop before entering the msleep() polling loop.
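
The two-phase wait can be sketched generically. This is a user-space model
under stated assumptions, not the driver's code: the callbacks stand in for
the hardware check, udelay() and msleep(); only the 200 usec cap comes from
the text above, while the step sizes and the sleep retry count are invented
for illustration. A fake device is included so the helper can be tried out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callbacks: completion check and the two delay primitives. */
struct model_poll_ops {
	bool (*done)(void *ctx);
	void (*busy_wait_us)(void *ctx, unsigned int us);	/* udelay() role */
	void (*sleep_ms)(void *ctx, unsigned int ms);		/* msleep() role */
};

#define MODEL_SPIN_STEP_US   10		/* assumed */
#define MODEL_SPIN_BUDGET_US 200	/* "200 usec max" */
#define MODEL_SLEEP_STEP_MS  1		/* assumed */
#define MODEL_SLEEP_TRIES    100	/* assumed */

/* Returns 0 once the transfer is done, -1 if it never completes. */
int model_wait_for_dma(const struct model_poll_ops *ops, void *ctx)
{
	unsigned int spent;
	int i;

	assert(ops && ops->done && ops->busy_wait_us && ops->sleep_ms);

	/* Phase 1: short busy-poll, cheap when the transfer finishes fast. */
	for (spent = 0; spent < MODEL_SPIN_BUDGET_US; spent += MODEL_SPIN_STEP_US) {
		if (ops->done(ctx))
			return 0;
		ops->busy_wait_us(ctx, MODEL_SPIN_STEP_US);
	}

	/* Phase 2: fall back to sleeping between checks. */
	for (i = 0; i < MODEL_SLEEP_TRIES; i++) {
		if (ops->done(ctx))
			return 0;
		ops->sleep_ms(ctx, MODEL_SLEEP_STEP_MS);
	}
	return ops->done(ctx) ? 0 : -1;
}

/* Fake device: reports completion after a given number of checks. */
struct model_fake_dev {
	int polls_until_done;
	unsigned int spins;
	unsigned int sleeps;
};

bool model_fake_done(void *ctx)
{
	struct model_fake_dev *d = ctx;

	return d->polls_until_done-- <= 0;
}

void model_fake_spin(void *ctx, unsigned int us)
{
	(void)us;
	((struct model_fake_dev *)ctx)->spins++;
}

void model_fake_sleep(void *ctx, unsigned int ms)
{
	(void)ms;
	((struct model_fake_dev *)ctx)->sleeps++;
}

const struct model_poll_ops model_fake_ops = {
	model_fake_done, model_fake_spin, model_fake_sleep
};
```

A transfer that completes within the spin budget never reaches the sleep
path, which is the case the change is meant to speed up.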

Signed-off-by: Hideyuki Sasaki <xhide@rd.scei.sony.co.jp>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Acked-by: Jim Paris <jim@jtan.com>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-04 09:09:28 +01:00
Jens Axboe 2058297d2d Merge branch 'for-linus' into for-2.6.33
Conflicts:
	block/cfq-iosched.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-03 21:14:39 +01:00
Alexey Dobriyan cf6e693212 loop: fix NULL dereference if mount fails
Commit bb21488482 ("[PATCH] switch loop")
started to pass NULL bdev to ioctl hook.

Steps to reproduce:

	[boot with loop.max_part=1]
	[mount -o loop something so mount fails]

BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
IP: [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
PGD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:35/ACPI0003:00/power_supply/ACAD/online
CPU 0
Modules linked in: zfs nvidia(P) [last unloaded: zfs]
Pid: 15177, comm: mount Tainted: P           2.6.32-rc4-zfs #2 Satellite X200
RIP: 0010:[<ffffffff811486ee>]  [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
RSP: 0018:ffff88003b3d5bb8  EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 000000000000125f RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88003b3d5ce8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00007ffffffff000
R13: 0000000000000000 R14: ffff880071cef280 R15: 00000000000200da
FS:  00007fd77cfe7740(0000) GS:ffff880001600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000b8 CR3: 0000000001001000 CR4: 00000000000026f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process mount (pid: 15177, threadinfo ffff88003b3d4000, task ffff88007572f920)
Stack:
 ffff88003b3d5c38 ffffffff812f95f5 ffff88007eeb6600 0000000000000000
<0> 0000000000000000 ffff88003b3d5c18 ffffffff811547d9 ffff88001bf11ef0
<0> 7fffffffffffffff ffff88001bf11ee8 ffff88001bf11ef0 0000000000000000
Call Trace:
 [<ffffffff812f95f5>] ? schedule_timeout+0x1f5/0x250
 [<ffffffff811547d9>] ? rb_insert_color+0x109/0x140
 [<ffffffff812fb754>] ? _spin_unlock_irq+0x14/0x40
 [<ffffffff812f84c6>] ? wait_for_common+0x66/0x170
 [<ffffffff8105a280>] ? default_wake_function+0x0/0x10
 [<ffffffff810f8258>] ioctl_by_bdev+0x38/0x50
 [<ffffffff811d2481>] loop_clr_fd+0x1e1/0x210
 [<ffffffff811d2522>] lo_release+0x72/0x80
 [<ffffffff810f934c>] __blkdev_put+0x1ac/0x1d0
 [<ffffffff810f937b>] blkdev_put+0xb/0x10
 [<ffffffff810f93b9>] blkdev_close+0x39/0x60
 [<ffffffff810ccef3>] __fput+0xd3/0x230
 [<ffffffff810cd06d>] fput+0x1d/0x30
 [<ffffffff810c9680>] filp_close+0x50/0x80
 [<ffffffff81061f11>] put_files_struct+0x81/0x100
 [<ffffffff81061fde>] exit_files+0x4e/0x60
 [<ffffffff81063ec5>] do_exit+0x6b5/0x730
 [<ffffffff8107b279>] ? up_read+0x9/0x10
 [<ffffffff8104c86e>] ? do_page_fault+0x18e/0x2a0
 [<ffffffff81063f81>] do_group_exit+0x41/0xc0
 [<ffffffff81064012>] sys_exit_group+0x12/0x20
 [<ffffffff81030deb>] system_call_fastpath+0x16/0x1b
Code: f8 48 89 e5 48 81 ec 30 01 00 00 48 89 5d d8 4c 89 6d e8 4c 89 65 e0 4c 89 75 f0 4c 89 7d f8 48 89 bd e8 fe ff ff 49 89 cd 89 f3 <49> 8b 88 b8 00 00 00 81 fa 68 12 00 00 0f 84 57 05 00 00 0f 86
RIP  [<ffffffff811486ee>] blkdev_ioctl+0x2e/0xa30
 RSP <ffff88003b3d5bb8>
CR2: 00000000000000b8
---[ end trace c0b4d3c3118d1427 ]---
Fixing recursive fault but reboot is needed!

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:27 -07:00
Jens Axboe a870a3a485 drbd: fix in_flight rw indexing
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-28 09:30:27 +01:00
Rusty Russell 3225beaba0 virtio_blk: Revert serial number support
This reverts "Add serial number support for virtio_blk, V4a".

Turns out that virtio_pci, lguest and s/390 all have an 8 bit limit
on virtio config space, so no one could ever use this.

This is coming back later in a cleaner form.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: john cooper <john.cooper@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
2009-10-22 16:39:30 +10:30
Christian Borntraeger e95646c3ec virtio: let header files include virtio_ids.h
Rusty,

commit 3ca4f5ca73
    virtio: add virtio IDs file
moved all device IDs into a single file. While the change itself is
a very good one, it can break userspace applications. For example
if a userspace tool wanted to get the ID of virtio_net it used to
include virtio_net.h. This no longer works, since virtio_net.h
does not include virtio_ids.h.
This patch moves all "#include <linux/virtio_ids.h>" from the C
files into the header files, making the header files compatible with
the old ones.

In addition, this patch exports virtio_ids.h to userspace.

CC: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 16:39:28 +10:30
Christoph Hellwig f8b12e513b virtio_blk: revert QUEUE_FLAG_VIRT addition
It seems like the addition of QUEUE_FLAG_VIRT causes major performance
regressions for Fedora users:

	https://bugzilla.redhat.com/show_bug.cgi?id=509383
	https://bugzilla.redhat.com/show_bug.cgi?id=505695

While I can't reproduce those extreme regressions myself, I think the flag
is wrong.

Rationale:

  QUEUE_FLAG_VIRT expands to QUEUE_FLAG_NONROT which causes the queue to be
  unplugged immediately.  This is not a good behaviour for at least
  qemu and kvm where we do have significant overhead for every
  I/O operation.  Even with all the latest speedups (native AIO,
  MSI support, zero copy) we can only get native speed for up to 128kb
  I/O requests; we already are down to 66% of native performance for 4kb
  requests even on my laptop running the Intel X25-M SSD for which the
  QUEUE_FLAG_NONROT was designed.
  If we ever get virtio-blk overhead low enough that this flag makes
  sense it should only be set based on a feature flag set by the host.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 16:39:26 +10:30
Jens Axboe c30f33437c Merge branch 'for-linus' into for-2.6.33 2009-10-13 12:29:45 +02:00
Stephen M. Cameron 2ec24ff1d1 cciss: Add cciss_allow_hpsa module parameter
Add cciss_allow_hpsa module parameter.  This parameter causes
the cciss driver to ignore any Smart Array devices known to be
supported by the hpsa driver.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-13 09:18:22 +02:00
Stephen M. Cameron 2cfa948c9e cciss: Fix multiple calls to pci_release_regions
Fix multiple calls to pci_release_regions.  If cciss_pci_init
fails, it already does any necessary call to pci_release_regions,
so this does not need to be done again in cciss_init_one in that
case.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-13 09:18:22 +02:00
Randy Dunlap 132cc538cd drbd: needs __ratelimit()
drbd_int.h uses __ratelimit(), so it needs to #include ratelimit.h:

drivers/block/drbd/drbd_int.h:1765: error: implicit declaration of function '__ratelimit'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: drbd-dev@lists.linbit.com
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-07 19:26:00 +02:00
Philipp Reisner 9f5180e5c3 drbd: Work on permission enforcement
Now we have the capabilities of the sending process available,
use them to enforce CAP_SYS_ADMIN.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-06 09:30:14 +02:00
Jens Axboe 25d2d4edfa drbd: fixup for reverted dual in_flight patch
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-05 09:31:59 +02:00
Jens Axboe 5d13379a4d Merge branch 'master' into for-2.6.33 2009-10-05 09:30:10 +02:00
Linus Torvalds 58e57fbd1c Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (41 commits)
  Revert "Seperate read and write statistics of in_flight requests"
  cfq-iosched: don't delay async queue if it hasn't dispatched at all
  block: Topology ioctls
  cfq-iosched: use assigned slice sync value, not default
  cfq-iosched: rename 'desktop' sysfs entry to 'low_latency'
  cfq-iosched: implement slower async initiate and queue ramp up
  cfq-iosched: delay async IO dispatch, if sync IO was just done
  cfq-iosched: add a knob for desktop interactiveness
  Add a tracepoint for block request remapping
  block: allow large discard requests
  block: use normal I/O path for discard requests
  swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL
  fs/bio.c: move EXPORT* macros to line after function
  Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfs
  cciss: fix build when !PROC_FS
  block: Do not clamp max_hw_sectors for stacking devices
  block: Set max_sectors correctly for stacking devices
  cciss: cciss_host_attr_groups should be const
  cciss: Dynamically allocate the drive_info_struct for each logical drive.
  cciss: Add usage_count attribute to each logical drive in /sys
  ...
2009-10-04 12:39:14 -07:00
Alexey Dobriyan 828c09509b const: constify remaining file_operations
[akpm@linux-foundation.org: fix KVM]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-01 16:11:11 -07:00
Jens Axboe 6a0afdf58d drbd: remove tracing bits
They should be reimplemented in the current scheme.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:17:58 +02:00
Lars Ellenberg ab8fafc2e1 dropping unneeded include autoconf.h
It is force-included on the gcc command line since at least 2.6.15.
Explicit include lines seem to break compilation now in certain configurations.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2009-10-01 21:17:54 +02:00
Philipp Reisner b411b3637f The DRBD driver
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2009-10-01 21:17:49 +02:00
Alexander Beregalov 1e6f2dc119 cciss: fix build when !PROC_FS
Fix these build errors when CONFIG_PROC_FS is not set:
drivers/block/cciss.c: In function 'cciss_show_raid_level':
drivers/block/cciss.c:623: error: 'RAID_UNKNOWN' undeclared (first use in this function)
drivers/block/cciss.c:626: error: 'raid_label' undeclared (first use in this function)
drivers/block/cciss.c: In function 'cciss_geometry_inquiry':
drivers/block/cciss.c:2696: error: 'RAID_UNKNOWN' undeclared (first use in this function)

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:45 +02:00
Jens Axboe 9f792d9f58 cciss: cciss_host_attr_groups should be const
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:45 +02:00
Stephen M. Cameron 9cef0d2f4f cciss: Dynamically allocate the drive_info_struct for each logical drive.
cciss: Dynamically allocate the drive_info_struct for each logical drive.
This reduces the size of the per-hba ctlr_info structure from 106936
bytes to 8132 bytes.  That's on 32-bit systems.  On 64-bit systems, the
improvement is even bigger.  Without this, the ctlr_info struct is so big
that the driver won't even load on a 64 bit system if CISS_MAX_LUN was
at its current setting of 1024 logical drives.
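
The size effect of moving from embedded records to on-demand allocation can
be sketched as follows. The structures are hypothetical: the field list of
model_drive_info is invented and does not match drive_info_struct, so the
byte counts quoted above will not be reproduced, only the direction of the
change.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_LUN 1024	/* the 1024 logical drives mentioned above */

/* Invented per-drive record, for illustration only. */
struct model_drive_info {
	unsigned char lunid[8];
	unsigned int block_size;
	unsigned long long nr_blocks;
	char vendor[9], model[17], rev[5];
	int usage_count;
};

/* Before: a record for every possible drive lives inside the controller. */
struct model_ctlr_embedded {
	int num_luns;
	struct model_drive_info drv[MODEL_MAX_LUN];
};

/* After: the controller holds pointers; records are allocated as needed. */
struct model_ctlr_dynamic {
	int num_luns;
	struct model_drive_info *drv[MODEL_MAX_LUN];
};

/* Allocate the record for one logical drive; 0 on success, -1 on failure. */
int model_add_drive(struct model_ctlr_dynamic *h, int idx)
{
	assert(h && idx >= 0 && idx < MODEL_MAX_LUN);
	if (h->drv[idx])
		return 0;	/* already allocated */
	h->drv[idx] = calloc(1, sizeof(*h->drv[idx]));
	if (!h->drv[idx])
		return -1;
	h->num_luns++;
	return 0;
}
```

Comparing sizeof(struct model_ctlr_embedded) with
sizeof(struct model_ctlr_dynamic) shows the per-controller saving for
whatever record layout is plugged in.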

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:45 +02:00
Stephen M. Cameron e272afecaf cciss: Add usage_count attribute to each logical drive in /sys
Add usage_count attribute to each logical drive at
/sys/devices/<dev>/ccissX/cXdY/usage_count for controller X,
logical drive Y.  The usage count is the number of times
the device has currently been opened.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:44 +02:00
Stephen M. Cameron 3ff1111dc6 cciss: Add a "raid_level" attribute to each logical drive in /sys
and get rid of some magic numbers in raid level decoding.

Add raid_level attribute to each logical drive at
/sys/devices/<dev>/ccissX/cXdY/raid_level for controller X,
logical drive Y

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:44 +02:00
Stephen M. Cameron fa52bec9df cciss: fix some magic numbers in the raid-level decoding
cciss: fix some magic numbers in the raid-level decoding

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:44 +02:00
Stephen M. Cameron ce84a8aeac cciss: Add lunid attribute to each logical drive in /sys
Add lunid attribute to each logical drive at
/sys/devices/<dev>/ccissX/cXdY/lunid for controller X,
logical drive Y

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:44 +02:00
Stephen M. Cameron 2e043986d5 cciss: Don't check h->busy_initializing in cciss_open().
Don't check h->busy_initializing in cciss_open().  Open won't be
called before things are ready, but h->busy_initializing won't be
unset until after the initial rebuild_lun_table is finished.  But,
to read the partitions, cciss_open will be called for each logical
drive during rebuild_lun_table.  If cciss_open checks h->busy_initializing,
then the reading of the partition information during the initial
rebuild_lun_table will fail, which is especially bad news if it
happens to be your boot device.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:43 +02:00
Stephen M. Cameron 39ccf9a645 cciss: Preserve all 8 bytes of LUN ID for logical drives.
Preserve all 8 bytes of the LunID field returned
by CCISS_REPORT_LOGICAL instead of only saving 4 bytes.
This fixes a bug with logical volume addressing encountered on
an MSA2012.
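
Why four bytes are not enough can be shown with two identifiers that differ
only in their upper half. This is a user-space sketch with hypothetical
names; the byte values are made up and do not come from any real array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_LUNID_LEN 8

struct model_drive {
	unsigned char lunid[MODEL_LUNID_LEN];
};

/* Keep the whole 8-byte identifier as reported. */
void model_save_lunid(struct model_drive *d,
		      const unsigned char reported[MODEL_LUNID_LEN])
{
	assert(d && reported);
	memcpy(d->lunid, reported, MODEL_LUNID_LEN);
}

/* The lossy alternative, for comparison: only the first four bytes survive. */
uint32_t model_truncated_lunid(const unsigned char reported[MODEL_LUNID_LEN])
{
	uint32_t id;

	assert(reported);
	memcpy(&id, reported, sizeof(id));
	return id;
}
```

Two volumes whose identifiers share the first four bytes become
indistinguishable after truncation, while the full copies still differ.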

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:43 +02:00
Stephen M. Cameron 983333cb0c cciss: Silence noisy per-disk messages output by cciss_read_capacity
Silence noisy per-disk messages output by cciss_read_capacity

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:43 +02:00
Stephen M. Cameron 2c935593ac cciss: Fix excessive gendisk freeing bug on driver unload.
Fix bug that free_hba was calling put_disk for all gendisk[]
pointers -- all 1024 of them -- regardless of whether they were
used or not (NULL).  This bug could cause rmmod to oops if logical
drives had been deleted during the driver's lifetime.
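
The corrected teardown pattern, releasing only the slots that are in use,
might look like the sketch below. All names are hypothetical, and the put
helper's insistence on a valid pointer is an assumption of this model, not a
statement about the kernel's put_disk().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a disk object with a reference count. */
struct model_disk {
	int refs;
};

/* Drop one reference; in this model it must be given a live object. */
void model_put_disk(struct model_disk *d)
{
	assert(d != NULL);
	d->refs--;
}

/* Release only the used slots and clear them; returns how many were released. */
int model_free_disks(struct model_disk *disks[], size_t n)
{
	int released = 0;

	for (size_t i = 0; i < n; i++) {
		if (!disks[i])
			continue;	/* never used, or already deleted */
		model_put_disk(disks[i]);
		disks[i] = NULL;
		released++;
	}
	return released;
}
```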

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:43 +02:00
Stephen M. Cameron 2d11d9931f cciss: Fix usage_count check in rebuild_lun_table when triggered via sysfs.
When rebuild_lun_table is reached via sysfs, the usage count that
is checked prior to messing with c0d0 has different constraints
(must be zero) than if rebuild_lun_table is reached via ioctl
(must be one.)  Fix rebuild_lun_table to take that into account.
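
The two constraints can be folded into one check. The sketch below is a
hypothetical user-space model: it treats the expected count as a ceiling
(zero via sysfs, one via ioctl), which is an assumption about how the rule
might be expressed rather than a copy of the driver's logic.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * May c0d0 be modified? Via ioctl one open (presumably the caller's own) is
 * expected, so a count of one is fine; via sysfs nobody needs to hold the
 * device open, so any open at all means "busy".
 */
bool model_c0d0_may_change(int usage_count, bool via_ioctl)
{
	assert(usage_count >= 0);
	return usage_count <= (via_ioctl ? 1 : 0);
}
```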

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:42 +02:00
Stephen M. Cameron 9ddb27b44f cciss: Clear all sysfs-exposed data for deleted logical drives.
When removing a logical drive, clear all the information that is
now exposed by sysfs (e.g. vendor, model, serial number.)

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:42 +02:00
Stephen M. Cameron 8ce51966d3 cciss: Handle special case for sysfs attributes of the first logical drive.
For c0dx where x is not 0, we handle deletion and addition simply,
but for c0d0, there is the special case that even when there's no
disk, the device node exists so that the controller may be accessed.
So, for c0d0, we only create the sysfs entries once, when a controller
is added, and only remove them once, when a controller is being
taken down.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:42 +02:00
Stephen M. Cameron 361e9b07d1 cciss: Handle cases when cciss_add_disk fails.
Handle cases when cciss_add_disk fails.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:42 +02:00
Stephen M. Cameron e8074f7977 cciss: Handle failure of blk_init_queue gracefully in cciss_add_disk.
Handle failure of blk_init_queue gracefully in cciss_add_disk.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:42 +02:00
Stephen M. Cameron 097d026453 cciss: Rearrange logical drive sysfs code to make the "changing a disk" path work.
Rearrange logical drive sysfs code to make the "changing a disk" path work.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:41 +02:00
Stephen M. Cameron 617e134422 cciss: Dynamically allocate struct device for each logical drive as needed.
Dynamically allocate struct device for each logical drive as needed
instead of allocating the maximum we would ever need at driver init time.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:41 +02:00
Stephen M. Cameron 21d9db0b62 cciss: Remove some unused code in rebuild_lun_table()
Remove some unused code in rebuild_lun_table()

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:41 +02:00
Andrew Patterson d6f4965d7d cciss: Allow triggering of rescan of logical drive topology via sysfs entry
Added /sys/bus/pci/devices/<dev>/ccissX/rescan sysfs entry used
to kick off a rescan that discovers logical drive topology changes.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:41 +02:00
Andrew Patterson b368c9dd65 cciss: Use one scan thread per controller and fix hang during rmmod
Replace the use of one scan kthread per controller with one per driver.
Use a queue to hold a list of controllers that need to be rescanned with
routines to add and remove controllers from the queue.

Fix locking and completion handling to prevent a hang during rmmod.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:41 +02:00
Andrew Patterson c64bebcd7f cciss: Remove sysfs entries for logical drives on driver cleanup.
Sysfs entries for logical drives need to be removed when a drive is
deleted during driver cleanup.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:40 +02:00
Randy Dunlap 4d76160947 cciss: fix schedule_timeout() parameters
Change schedule_timeout() parameter to not be specific to HZ=1000.
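
For context, schedule_timeout() takes its argument in timer ticks, so a bare constant means a different wall-clock delay for each HZ setting. The userspace helper below sketches the usual conversion from milliseconds; ms_to_ticks is a hypothetical name, and the log does not say which conversion the patch itself uses (the kernel provides msecs_to_jiffies() for this purpose).

```c
#include <assert.h>

/*
 * Convert milliseconds to timer ticks for a given tick rate, rounding up
 * so the wait is never shorter than requested.  "hz" plays the role of
 * the kernel's HZ; this helper is a userspace illustration only.
 */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return (ms * hz + 999) / 1000;
}
```

A 10 ms wait is 10 ticks at HZ=1000 but only 1 tick at HZ=100, which is why a hard-coded tick count is wrong on most configurations.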

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Cc: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: "Cameron, Steve" <Steve.Cameron@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:40 +02:00
Alexey Dobriyan d5d03eec9b dac960: switch to seq_file
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Yang Hongyang <yanghy@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:40 +02:00
Alexey Dobriyan ff2c3de305 cpqarray: switch to seq_file
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Chirag Kantharia <chirag.kantharia@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:15:40 +02:00
Linus Torvalds 1f0918d03f Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: don't force VIRTIO_F_NOTIFY_ON_EMPTY
  lguest: cleanup for map_switcher()
  lguest: use PGDIR_SHIFT for PAE code to allow different PAGE_OFFSET
  lguest: use set_pte/set_pmd uniformly for real page table entries
  lguest: move panic notifier registration to its expected place.
  virtio_blk: add support for cache flush
  virtio: add virtio IDs file
  virtio: get rid of redundant VIRTIO_ID_9P definition
  virtio: make add_buf return capacity remaining
  virtio_pci: minor MSI-X cleanups
2009-09-23 09:23:45 -07:00
James Morris 88e9d34c72 seq_file: constify seq_operations
Make all seq_operations structs const, to help mitigate against
revectoring user-triggerable function pointers.
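
As a rough userspace illustration of the idea, a table of function pointers declared const can be placed in a read-only section and cannot be assigned to through normal code. The example_ops structure and its members are hypothetical and are not the real seq_operations layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table, loosely modelled on an iterator. */
struct example_ops {
    int (*start)(void);
    int (*next)(int pos);
};

static int example_start(void) { return 0; }
static int example_next(int pos) { return pos + 1; }

/*
 * Declared const: the compiler may place the table in read-only data
 * and rejects any assignment to its members.
 */
static const struct example_ops example_seq_ops = {
    .start = example_start,
    .next  = example_next,
};

/* Callers only ever need a pointer-to-const. */
static int walk_twice(const struct example_ops *ops)
{
    assert(ops != NULL && ops->start != NULL && ops->next != NULL);
    return ops->next(ops->next(ops->start()));
}
```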

This is derived from the grsecurity patch, although generated from scratch
because it's simpler than extracting the changes from there.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 07:39:29 -07:00
Michael Buesch e898893399 dac960: fix undefined behavior on empty string
Fix undefined behavior due to a buffer underrun if an empty string is
written to the proc file.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 07:39:28 -07:00
Christoph Hellwig f1b0ef0626 virtio_blk: add support for cache flush
Recent qemu has added a VIRTIO_BLK_F_FLUSH flag to advertise that the
virtual disk has a volatile write cache that needs to be flushed.  If
we see this feature, tell the Linux block layer about it and use the
new VIRTIO_BLK_T_FLUSH to flush the cache when required.  This allows
for a correct and simple implementation of write barriers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-23 22:26:36 +09:30
Fernando Luis Vazquez Cao 3ca4f5ca73 virtio: add virtio IDs file
Virtio IDs are spread all over the tree which makes assigning new IDs
bothersome. Putting them together should make the process less error-prone.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-23 22:26:32 +09:30
Rusty Russell 3c1b27d504 virtio: make add_buf return capacity remaining
This API change means that virtio_net can tell how much capacity
remains for buffers.  It's necessarily fuzzy, since
VIRTIO_RING_F_INDIRECT_DESC means we can fit any number of descriptors
in one, *if* we can kmalloc.
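
A simplified sketch of the new return convention, assuming a fixed-size ring with no indirect descriptors (so the "fuzzy" part is not modelled). The example_ring type, example_add_buf name, and the choice of -ENOSPC for a full ring are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical descriptor ring with a fixed number of slots. */
struct example_ring {
    unsigned int size;  /* total descriptors */
    unsigned int used;  /* descriptors currently in use */
};

/*
 * Add a buffer that needs n descriptors.  Returns the capacity that
 * remains afterwards (>= 0), or a negative errno if it does not fit.
 */
static int example_add_buf(struct example_ring *r, unsigned int n)
{
    assert(r->used <= r->size);
    if (n > r->size - r->used)
        return -ENOSPC;
    r->used += n;
    return (int)(r->size - r->used);
}
```

A caller can then stop its transmit queue as soon as the returned capacity drops below what the next packet might need, instead of finding out only when an add fails.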

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dinesh Subhraveti <dineshs@us.ibm.com>
2009-09-23 22:26:31 +09:30
Linus Torvalds 342ff1a1b5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  trivial: fix typo in aic7xxx comment
  trivial: fix comment typo in drivers/ata/pata_hpt37x.c
  trivial: typo in kernel-parameters.txt
  trivial: fix typo in tracing documentation
  trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
  trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
  trivial: remove unnecessary semicolons
  trivial: Fix duplicated word "options" in comment
  trivial: kbuild: remove extraneous blank line after declaration of usage()
  trivial: improve help text for mm debug config options
  trivial: doc: hpfall: accept disk device to unload as argument
  trivial: doc: hpfall: reduce risk that hpfall can do harm
  trivial: SubmittingPatches: Fix reference to renumbered step
  trivial: fix typos "man[ae]g?ment" -> "management"
  trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
  trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
  trivial: fix missing printk space in amd_k7_smp_check
  trivial: fix typo s/ketymap/keymap/ in comment
  trivial: fix typo "to to" in multiple files
  trivial: fix typos in comments s/DGBU/DBGU/
  ...
2009-09-22 07:51:45 -07:00
Alexey Dobriyan 83d5cde47d const: make block_device_operations const
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22 07:17:25 -07:00
Joe Perches a419aef8b8 trivial: remove unnecessary semicolons
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-21 15:14:58 +02:00
Peter Huewe 3c36543aea trivial: add __init/__exit macros to DAC960.c
Trivial patch which adds the __init and __exit macros to the module_init /
module_exit functions from drivers/block/DAC960.c

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-21 15:14:52 +02:00
Kay Sievers e454cea20b Driver-Core: extend devnode callbacks to provide permissions
This allows subsystems to provide devtmpfs with non-default permissions
for the device node. Instead of the default mode of 0600, null, zero,
random, urandom, full, tty, ptmx now have a mode of 0666, which allows
non-privileged processes to access standard device nodes in case no
other userspace process applies the expected permissions.
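
The effect can be sketched as a callback that may override a default mode. This is a userspace illustration under stated assumptions: example_devnode and node_mode are hypothetical names and do not reproduce the kernel's devnode callback signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEFAULT_NODE_MODE 0600  /* default named in the commit message */

/* Nodes the commit message lists as world read/write. */
static const char *const world_rw_nodes[] = {
    "null", "zero", "random", "urandom", "full", "tty", "ptmx",
};

/* Hypothetical callback: overrides *mode only for the nodes above. */
static void example_devnode(const char *name, unsigned int *mode)
{
    size_t i;

    assert(name != NULL);
    if (mode == NULL)
        return;
    for (i = 0; i < sizeof(world_rw_nodes) / sizeof(world_rw_nodes[0]); i++) {
        if (strcmp(name, world_rw_nodes[i]) == 0) {
            *mode = 0666;
            return;
        }
    }
}

/* Start from the default and let the callback adjust it. */
static unsigned int node_mode(const char *name)
{
    unsigned int mode = DEFAULT_NODE_MODE;

    example_devnode(name, &mode);
    return mode;
}
```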

This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complaint.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-19 12:50:38 -07:00
Linus Torvalds ab86e5765d Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev
  debugfs: Modify default debugfs directory for debugging pktcdvd.
  debugfs: Modified default dir of debugfs for debugging UHCI.
  debugfs: Change debugfs directory of IWMC3200
  debugfs: Change debugfs directory of trace-events-sample.h
  debugfs: Fix mount directory of debugfs by default in events.txt
  hpilo: add poll f_op
  hpilo: add interrupt handler
  hpilo: staging for interrupt handling
  driver core: platform_device_add_data(): use kmemdup()
  Driver core: Add support for compatibility classes
  uio: add generic driver for PCI 2.3 devices
  driver-core: move dma-coherent.c from kernel to driver/base
  mem_class: fix bug
  mem_class: use minor as index instead of searching the array
  driver model: constify attribute groups
  UIO: remove 'default n' from Kconfig
  Driver core: Add accessor for device platform data
  Driver core: move dev_get/set_drvdata to drivers/base/dd.c
  Driver core: add new device to bus's list before probing
2009-09-16 08:27:10 -07:00
Linus Torvalds 723e9db7a4 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (134 commits)
  powerpc/nvram: Enable use Generic NVRAM driver for different size chips
  powerpc/iseries: Fix oops reading from /proc/iSeries/mf/*/cmdline
  powerpc/ps3: Workaround for flash memory I/O error
  powerpc/booke: Don't set DABR on 64-bit BookE, use DAC1 instead
  powerpc/perf_counters: Reduce stack usage of power_check_constraints
  powerpc: Fix bug where perf_counters breaks oprofile
  powerpc/85xx: Fix SMP compile error and allow NULL for smp_ops
  powerpc/irq: Improve nanodoc
  powerpc: Fix some late PowerMac G5 with PCIe ATI graphics
  powerpc/fsl-booke: Use HW PTE format if CONFIG_PTE_64BIT
  powerpc/book3e: Add missing page sizes
  powerpc/pseries: Fix to handle slb resize across migration
  powerpc/powermac: Thermal control turns system off too eagerly
  powerpc/pci: Merge ppc32 and ppc64 versions of phb_scan()
  powerpc/405ex: support cuImage via included dtb
  powerpc/405ex: provide necessary fixup function to support cuImage
  powerpc/40x: Add support for the ESTeem 195E (PPC405EP) SBC
  powerpc/44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support.
  powerpc/44x: Update Arches defconfig
  powerpc/44x: Update Arches dts
  ...

Fix up conflicts in drivers/char/agp/uninorth-agp.c
2009-09-15 09:51:09 -07:00
GeunSik Lim ea5ffff57d debugfs: Modify default debugfs directory for debugging pktcdvd.
As we all know, we need to change the default directory for consistency of
debugfs by Greg K-H

Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-15 09:50:49 -07:00
David Brownell a4dbd6740d driver model: constify attribute groups
Let attribute group vectors be declared "const".  We'd
like to let most attribute metadata live in read-only
sections... this is a start.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-15 09:50:47 -07:00
Linus Torvalds f86054c245 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (23 commits)
  at_hdmac: Rework suspend_late()/resume_early()
  PM: Reset transition_started at dpm_resume_noirq
  PM: Update kerneldoc comments in drivers/base/power/main.c
  PM: Add convenience macro to make switching to dev_pm_ops less error-prone
  hp-wmi: Switch driver to dev_pm_ops
  floppy: Switch driver to dev_pm_ops
  PM: Trivial fixes
  PM / Hibernate / Memory hotplug: Always use for_each_populated_zone()
  PM/Hibernate: Do not try to allocate too much memory too hard (rev. 2)
  PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2)
  PM/Hibernate: Rework shrinking of memory
  PM: Fix typo in label name s/Platofrm_finish/Platform_finish/
  PM: Run-time PM platform device bus support
  PM: Introduce core framework for run-time PM of I/O devices (rev. 17)
  Driver Core: Make PM operations a const pointer
  PM: Remove platform device suspend_late()/resume_early() V2
  USB: Rework musb suspend()/resume_early()
  I2C: Rework i2c-s3c2410 suspend_late()/resume() V2
  I2C: Rework i2c-pxa suspend_late()/resume_early()
  DMA: Rework txx9dmac suspend_late()/resume_early()
  ...

Fix trivial conflict in drivers/base/platform.c (due to same
constification patch being merged in both sides, along with some other
PM work in the PM branch)
2009-09-14 20:03:54 -07:00
Linus Torvalds 355bbd8cb8 Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits)
  block: use blkdev_issue_discard in blk_ioctl_discard
  Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads
  block: don't assume device has a request list backing in nr_requests store
  block: Optimal I/O limit wrapper
  cfq: choose a new next_req when a request is dispatched
  Separate read and write statistics of in_flight requests
  aoe: end barrier bios with EOPNOTSUPP
  block: trace bio queueing trial only when it occurs
  block: enable rq CPU completion affinity by default
  cfq: fix the log message after dispatched a request
  block: use printk_once
  cciss: memory leak in cciss_init_one()
  splice: update mtime and atime on files
  block: make blk_iopoll_prep_sched() follow normal 0/1 return convention
  cfq-iosched: get rid of must_alloc flag
  block: use interrupts disabled version of raise_softirq_irqoff()
  block: fix comment in blk-iopoll.c
  block: adjust default budget for blk-iopoll
  block: fix long lines in block/blk-iopoll.c
  block: add blk-iopoll, a NAPI like approach for block devices
  ...
2009-09-14 17:55:15 -07:00
Frans Pop c90cd332d3 floppy: Switch driver to dev_pm_ops
Gets rid of the following warning:
Platform driver 'floppy' needs updating - please use dev_pm_ops

[rjw: Fixed up the definition of floppy_pm_ops.]

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2009-09-14 20:26:59 +02:00
Ed Cashin 18d8217bc4 aoe: end barrier bios with EOPNOTSUPP
BugLink: http://bugzilla.kernel.org/show_bug.cgi?id=13942

Bruno Premont noticed that aoe throws a BUG during umount of an XFS in
2.6.31:

[ 5259.349897] aoe: bi_io_vec is NULL
[ 5259.349940] ------------[ cut here ]------------
[ 5259.349958] kernel BUG at /usr/src/linux-2.6/drivers/block/aoe/aoeblk.c:177!
[ 5259.349990] invalid opcode: 0000 [#1]

The bio in question is a barrier.  Jens Axboe suggested that such bios
need to be recognized and ended with -EOPNOTSUPP by any driver that
provides its own ->make_request_fn handler and does not handle
barriers.
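
The suggestion above can be sketched in userspace as an early check at the top of the driver's submission path. The example_bio structure and both functions are hypothetical stand-ins, not the block layer's types.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio: a barrier flag plus completion state. */
struct example_bio {
    bool barrier;
    bool done;
    int status;
};

/* Complete a bio with the given status. */
static void example_bio_end(struct example_bio *bio, int status)
{
    bio->done = true;
    bio->status = status;
}

/*
 * Submission path for a driver that does not support barriers.
 * Returns true if the bio was accepted for normal processing.
 */
static bool example_make_request(struct example_bio *bio)
{
    assert(bio != NULL);
    if (bio->barrier) {
        /* Recognize the barrier up front and refuse it cleanly. */
        example_bio_end(bio, -EOPNOTSUPP);
        return false;
    }
    /* ... normal request handling would continue here ... */
    return true;
}
```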

In testing the changes below eliminate the BUG.

(Better would be real barrier support, something that Ed says he'll add
for later in the .32 cycle. For now, this at least gets rid of a bug
with crashing on an empty barrier. Jens)

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:52 +02:00
Marcin Slusarz 49b3a3cbc0 block: use printk_once
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tim Waugh <tim@cyberelk.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 14:34:33 +02:00
Eric Dumazet 212a502676 cciss: memory leak in cciss_init_one()
commit 22bece00dc
(cciss: fix regression firmware not displayed in procfs)
added a small memory leak in cciss_init_one()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 14:34:33 +02:00
Jens Axboe 1f98a13f62 bio: first step in sanitizing the bio->bi_rw flag testing
Get rid of any functions that test for these bits and make callers
use bio_rw_flagged() directly. Then it is at least directly apparent
what variable and flag they check.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 14:33:31 +02:00
Jens Axboe d993831fa7 writeback: add name to backing_dev_info
This enables us to track who does what and print info. Its main use
is catching dirty inodes on the default_backing_dev_info, so we can
fix that up.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:26 +02:00
Ed Cashin 7135a71b19 aoe: allocate unused request_queue for sysfs
Andy Whitcroft reported an oops in aoe triggered by use of an
incorrectly initialised request_queue object:

  [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add
		an uninitialized object, something is seriously wrong.
  [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu
  [ 2645.959107] Call Trace:
  [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70
  [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0
  [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160
  [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe]

The request queue of an aoe device is not used but can be allocated in
code that does not sleep.

Bruno bisected this regression down to

  cd43e26f07

  block: Expose stacked device queues in sysfs

"This seems to generate /sys/block/$device/queue and its contents for
 everyone who is using queues, not just for those queues that have a
 non-NULL queue->request_fn."

Addresses http://bugs.launchpad.net/bugs/410198
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942

Note that embedding a queue inside another object has always been
an illegal construct, since the queues are reference counted and
must persist until the last reference is dropped. So aoe was
always buggy in this respect (Jens).
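
A small userspace sketch of the lifetime rule in that note, assuming a plain integer reference count. The types and helpers are hypothetical; the point is only that the queue is allocated on its own and outlives the device that created it until the last reference is dropped.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted queue. */
struct example_queue {
    int refs;
};

/* The device holds a pointer to the queue rather than embedding it. */
struct example_dev {
    struct example_queue *queue;
};

/* Allocate a queue with one reference owned by the caller. */
static struct example_queue *example_queue_alloc(void)
{
    struct example_queue *q = malloc(sizeof(*q));

    if (q != NULL)
        q->refs = 1;
    return q;
}

static void example_queue_get(struct example_queue *q)
{
    assert(q != NULL && q->refs > 0);
    q->refs++;
}

/* Drop one reference; returns 1 if this was the last one and q was freed. */
static int example_queue_put(struct example_queue *q)
{
    assert(q != NULL && q->refs > 0);
    if (--q->refs == 0) {
        free(q);
        return 1;
    }
    return 0;
}
```

Had the queue been a member of example_dev, freeing the device would have freed the queue's memory while another holder still had a reference to it.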

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Bruno Premont <bonbons@linux-vserver.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-09 14:10:18 +02:00
Geert Uytterhoeven 9413c8836a powerpc/cell: Move CBE_IOPTE_* to <asm/cell-regs.h>
As <asm/iommu.h> doesn't contain any other hardware specific definitions
but only interfaces.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:29:26 +10:00
unsik Kim a85a00a699 mg_disk: Add missing ready status check on mg_write()
When the last sector is written, the ready bit of the status register
should be checked.

Signed-off-by: unsik Kim <donari75@gmail.com>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-28 08:57:33 +02:00
Bartlomiej Zolnierkiewicz 394c6cc63c mg_disk: fix issue with data integrity on error in mg_write()
We cannot acknowledge the sector write before checking its status
(which is done on the next loop iteration) and we also need to do
the final status register check after writing the last sector.

Fix mg_write() to match mg_write_intr() in this regard.
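
The ordering described above can be sketched with a simulated device. This is an assumption-laden illustration, not the mg_disk code: example_dev, its fail_at field, and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simulated device: "written" counts sectors sent so far, and the status
 * register reports an error right after sector index fail_at is written
 * (fail_at < 0 means no sector ever fails).
 */
struct example_dev {
    int written;
    int fail_at;
};

/* Status of the most recently written sector. */
static bool example_status_ok(const struct example_dev *d)
{
    return d->fail_at < 0 || d->written - 1 != d->fail_at;
}

/* Write nsect sectors; returns how many were acknowledged as good. */
static int example_write(struct example_dev *d, int nsect)
{
    int acked = 0;
    int i;

    assert(nsect >= 0);
    for (i = 0; i < nsect; i++) {
        if (i > 0) {
            /* Check the previous sector before acknowledging it. */
            if (!example_status_ok(d))
                return acked;
            acked++;
        }
        d->written++;   /* send sector i to the device */
    }
    /* Final status check, needed for the last sector. */
    if (nsect > 0 && example_status_ok(d))
        acked++;
    return acked;
}
```

Without the final check, a failure on the last sector would go unnoticed; acknowledging before the check would report a failed sector as written.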

While at it:
- add mg_read_one() and mg_write_one() helpers
- always use MG_SECTOR_SIZE and remove MG_STORAGE_BUFFER_SIZE

[bart: thanks to Tejun for porting the patch over recent block changes]

Cc: unsik Kim <donari75@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-28 08:56:34 +02:00
unsik Kim eb32baec15 mg_disk: fix reading invalid status when use polling driver
When using the polling driver, a short delay is required before accessing
the status register. Without it, the host might read an invalid status.

Signed-off-by: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-28 08:52:07 +02:00
unsik Kim 48f5690d45 mg_disk: remove prohibited sleep operation
mflash's polling driver operates in the standard request_fn_proc context;
sleeping there isn't permitted.

Signed-off-by: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-28 08:52:06 +02:00
Linus Torvalds bb184d11ff Merge branch 'tj-block-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc
* 'tj-block-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
  virtio_blk: mark virtio_blk with __refdata to kill spurious section mismatch
  block: sysfs fix mismatched queue_var_{store,show} in 64bit kernel
  ataflop: adjust NULL test
  block: fix failfast merge testing in elv_rq_merge_ok()
  z2ram: Small cleanup for z2ram.c
2009-07-22 10:06:33 -07:00
Rakib Mullick 4fbfff7607 virtio_blk: mark virtio_blk with __refdata to kill spurious section mismatch
The variable virtio_blk references the function virtblk_probe() (which
is in .devinit section) and also references the function
virtblk_remove() (which is in .devexit section). So, virtio_blk
simultaneously refers to the .devinit and .devexit sections. To avoid
this mess, we mark virtio_blk as __refdata.

We were alerted by the following warnings:

  LD      drivers/block/built-in.o
  WARNING: drivers/block/built-in.o(.data+0xc8dc): Section mismatch in
  reference from the variable virtio_blk to the function
  .devinit.text:virtblk_probe()
  The variable virtio_blk references
  the function __devinit virtblk_probe()
  If the reference is valid then annotate the
  variable with __init* or __refdata (see linux/init.h) or name the variable:
  *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

  WARNING: drivers/block/built-in.o(.data+0xc8e0): Section mismatch in
  reference from the variable virtio_blk to the function
  .devexit.text:virtblk_remove()
  The variable virtio_blk references
  the function __devexit virtblk_remove()
  If the reference is valid then annotate the
  variable with __exit* (see linux/init.h) or name the variable:
  *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-07-19 10:46:48 +09:00
Christoph Hellwig d9ecdea7ed virtio_blk: ioctl return value fix
Block driver ioctl methods must return -ENOTTY and not -ENOIOCTLCMD if
they expect the block layer to handle generic ioctls.

This triggered a BLKROSET failure in xfsqa #200.
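
A much simplified model of the convention, assuming a generic layer that falls back to its own handling only when the driver answers -ENOTTY. The command numbers and function names are hypothetical, and the real block layer's fallback logic is more involved than this.

```c
#include <errno.h>

#define EXAMPLE_CMD_DRIVER  1   /* handled by the driver itself */
#define EXAMPLE_CMD_GENERIC 2   /* handled by the generic layer */

/* Driver ioctl: anything it does not recognize is reported as -ENOTTY. */
static int example_driver_ioctl(unsigned int cmd)
{
    switch (cmd) {
    case EXAMPLE_CMD_DRIVER:
        return 0;
    default:
        return -ENOTTY;
    }
}

/* Generic layer: try the driver first, fall back only on -ENOTTY. */
static int example_blkdev_ioctl(unsigned int cmd)
{
    int ret = example_driver_ioctl(cmd);

    if (ret != -ENOTTY)
        return ret;
    return cmd == EXAMPLE_CMD_GENERIC ? 0 : -ENOTTY;
}
```

In this model, a driver returning any other error code for an unknown command would stop the generic handler from ever running.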

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-07-17 21:47:46 +09:30
Christoph Hellwig 4eff3cae9c virtio_blk: don't bounce highmem requests
By default a block driver bounces highmem requests, but virtio-blk is
perfectly fine with any request that fits into its 64-bit addressing scheme,
mapped in the kernel virtual space or not.

Besides improving performance on highmem systems, this also makes the
reproducible oops in __bounce_end_io go away (while hiding the real cause).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-07-17 21:47:46 +09:30
Julia Lawall 8f47428704 ataflop: adjust NULL test
dtp is dereferenced on the lines above the test !dtp, and so it cannot be
NULL at this point.

A simplified version of the semantic match that finds this problem is as
follows: (http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r@
expression x,E,E1;
identifier f,l;
position p1,p2;
@@

*x@p1->f = E1;
... when != x = E
    when != goto l;
(
*x@p2 == NULL
|
*x@p2 != NULL
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-07-17 15:29:58 +09:00
Zhaolei c9d4bc289c z2ram: Small cleanup for z2ram.c
We should use Z2MINOR_COUNT as range argument in blk_unregister_region()

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-07-15 11:27:40 +09:00
Alexey Dobriyan 405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
Linus Torvalds 04eef90c2e Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  osdblk: Adjust queue limits to lower device's limits
  osdblk: a Linux block device for OSD objects
  MAINTAINERS: Add osd maintained files (F:)
  exofs: Avoid using file_fsync()
  exofs: Remove IBM copyrights
  exofs: Fix bio leak in error handling path (sync read)
2009-07-10 19:12:24 -07:00
Jens Axboe 8aa7e847d8 Fix congestion_wait() sync/async vs read/write confusion
Commit 1faa16d228 accidentally broke
the bdi congestion wait queue logic, causing us to wait on congestion
for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead.
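
The confusion is easy to reproduce in a toy model with two wait queues indexed by sync/async. The EX_-prefixed names are hypothetical mirrors of the kernel constants; the values for WRITE and BLK_RW_ASYNC come from the message above, while READ == 0 and BLK_RW_SYNC == 1 are filled in as assumptions of this sketch.

```c
#include <assert.h>

/* Data direction. */
enum { EX_READ = 0, EX_WRITE = 1 };

/* Congestion wait queue index. */
enum { EX_BLK_RW_ASYNC = 0, EX_BLK_RW_SYNC = 1 };

/* Hypothetical backing device with one waiter count per queue. */
struct example_bdi {
    int waiters[2];
};

/* The parameter is a sync/async index, not a data direction. */
static void example_congestion_wait(struct example_bdi *bdi, int sync)
{
    assert(sync == EX_BLK_RW_ASYNC || sync == EX_BLK_RW_SYNC);
    bdi->waiters[sync]++;
}
```

Passing EX_WRITE compiles without complaint but lands on the sync queue, since both enumerations share the same integer range.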

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-07-10 20:31:53 +02:00
Joe Perches ad361c9884 Remove multiple KERN_ prefixes from printk formats
Commit 5fd29d6ccb ("printk: clean up
handling of log-levels and newlines") changed printk semantics.  printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.

<level> is now included in the output on each additional use.

Remove all uses of multiple KERN_<level>s in formats.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-08 10:30:03 -07:00
Hannes Reinecke b59e64d0dd cciss: Ignore stale commands after reboot
When doing an unexpected shutdown like kexec the cciss
firmware might still have some commands in flight, which
it is trying to complete.
The driver is doing its best at resetting the HBA,
but sadly there's a firmware issue causing the firmware
_not_ to abort or drop old commands.
So the firmware will send us commands which we haven't
accounted for, causing the driver to panic.

With this patch we're just ignoring these commands as
there is nothing we could be doing with them anyway.
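
In outline, the completion path looks up the tag the firmware reported and drops it if nothing matches. The sketch below is a userspace model with hypothetical names (example_hba, example_complete) and a tiny fixed command table; it does not reflect the driver's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_CMDS 4

/* Hypothetical controller state: the tags we have actually submitted. */
struct example_hba {
    uint32_t tag[EX_MAX_CMDS];
    bool in_use[EX_MAX_CMDS];
    unsigned int stale_ignored;
};

/*
 * Handle a completion reported by the firmware.  Returns true if the tag
 * matched an outstanding command; an unknown tag is counted and ignored.
 */
static bool example_complete(struct example_hba *h, uint32_t tag)
{
    size_t i;

    assert(h != NULL);
    for (i = 0; i < EX_MAX_CMDS; i++) {
        if (h->in_use[i] && h->tag[i] == tag) {
            h->in_use[i] = false;
            return true;
        }
    }
    h->stale_ignored++;     /* left over from before the reset */
    return false;
}
```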

Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <axboe@carl.(none)>
2009-07-03 21:06:45 +02:00
Jiri Slaby 8516a50002 floppy: fix lock imbalance
A crappy macro prevents us unlocking on a fail path.

Expand the macro and unlock appropriately.
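
The general pattern, sketched in userspace: a macro with a hidden return skips the unlock on failure, while the expanded form can release the lock first. EX_CHECK, example_lock, and the two functions are hypothetical; the floppy driver's actual macro is not reproduced here.

```c
#include <assert.h>
#include <errno.h>

/* Toy lock that only tracks nesting depth. */
struct example_lock {
    int depth;
};

static void ex_lock(struct example_lock *l)
{
    l->depth++;
}

static void ex_unlock(struct example_lock *l)
{
    assert(l->depth > 0);
    l->depth--;
}

/* A macro with a hidden return, as in the pattern being fixed. */
#define EX_CHECK(x) do { if ((x) != 0) return -EINTR; } while (0)

/* Before: the hidden return leaves the lock held on failure. */
static int ex_op_buggy(struct example_lock *l, int fail)
{
    ex_lock(l);
    EX_CHECK(fail);
    ex_unlock(l);
    return 0;
}

/* After: the macro is expanded so the fail path can unlock. */
static int ex_op_fixed(struct example_lock *l, int fail)
{
    ex_lock(l);
    if (fail != 0) {
        ex_unlock(l);
        return -EINTR;
    }
    ex_unlock(l);
    return 0;
}
```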

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-30 18:56:01 -07:00
Boaz Harrosh bc47df0fa7 osdblk: Adjust queue limits to lower device's limits
Call blk_queue_stack_limits() to copy queue limits from
the underlying osd scsi_device. This is absolutely needed
because osdblk cannot sleep when allocating a lower-request and
therefore cannot be bouncing.

TODO: Dynamic changes of limits to the lower device queue
will not be reflected in the upper driver

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-06-24 12:27:01 +03:00
Jeff Garzik 2a13877c5e osdblk: a Linux block device for OSD objects
Submitted driver exports a block device of the form /dev/osdblkX,
where X is a decimal number.

It does that by mounting a stacking block device on top
of an osd object. For example, if you create a 2G object
on an OSD device, you can then use this module to present
that 2G object as a Linux block device.

See inside patch for exact documentation.

[Sitting at linux-next helped fix proper Kconfig dependency
 for this driver, thanks to Randy Dunlap]

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-06-24 12:25:02 +03:00
Christoph Hellwig ddeb9c3e94 hd: stop defining MAJOR_NR
Just use HD_MAJOR directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-18 09:56:20 +02:00
Linus Torvalds 6fd03301d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (64 commits)
  debugfs: use specified mode to possibly mark files read/write only
  debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
  xen: remove driver_data direct access of struct device from more drivers
  usb: gadget: at91_udc: remove driver_data direct access of struct device
  uml: remove driver_data direct access of struct device
  block/ps3: remove driver_data direct access of struct device
  s390: remove driver_data direct access of struct device
  parport: remove driver_data direct access of struct device
  parisc: remove driver_data direct access of struct device
  of_serial: remove driver_data direct access of struct device
  mips: remove driver_data direct access of struct device
  ipmi: remove driver_data direct access of struct device
  infiniband: ehca: remove driver_data direct access of struct device
  ibmvscsi: gadget: at91_udc: remove driver_data direct access of struct device
  hvcs: remove driver_data direct access of struct device
  xen block: remove driver_data direct access of struct device
  thermal: remove driver_data direct access of struct device
  scsi: remove driver_data direct access of struct device
  pcmcia: remove driver_data direct access of struct device
  PCIE: remove driver_data direct access of struct device
  ...

Manually fix up trivial conflicts due to different direct driver_data
direct access fixups in drivers/block/{ps3disk.c,ps3vram.c}
2009-06-16 12:57:37 -07:00
Linus Torvalds d613839ef9 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: remove some includings of blktrace_api.h
  mg_disk: seperate mg_disk.h again
  block: Introduce helper to reset queue limits to default values
  cfq: remove extraneous '\n' in blktrace output
  ubifs: register backing_dev_info
  btrfs: properly register fs backing device
  block: don't overwrite bdi->state after bdi_init() has been run
  cfq: cleanup for last_end_request in cfq_data
2009-06-16 11:46:45 -07:00
Linus Torvalds 609106b9ac Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (38 commits)
  ps3flash: Always read chunks of 256 KiB, and cache them
  ps3flash: Cache the last accessed FLASH chunk
  ps3: Replace direct file operations by callback
  ps3: Switch ps3_os_area_[gs]et_rtc_diff to EXPORT_SYMBOL_GPL()
  ps3: Correct debug message in dma_ioc0_map_pages()
  drivers/ps3: Add missing annotations
  ps3fb: Use ps3_system_bus_[gs]et_drvdata() instead of direct access
  ps3flash: Use ps3_system_bus_[gs]et_drvdata() instead of direct access
  ps3: shorten ps3_system_bus_[gs]et_driver_data to ps3_system_bus_[gs]et_drvdata
  ps3: Use dev_[gs]et_drvdata() instead of direct access for system bus devices
  block/ps3: remove driver_data direct access of struct device
  ps3vram: Make ps3vram_priv.reports a void *
  ps3vram: Remove no longer used ps3vram_priv.ddr_base
  ps3vram: Replace mutex by spinlock + bio_list
  block: Add bio_list_peek()
  powerpc: Use generic atomic64_t implementation on 32-bit processors
  lib: Provide generic atomic64_t implementation
  powerpc: Add compiler memory barrier to mtmsr macro
  powerpc/iseries: Mark signal_vsp_instruction() as maybe unused
  powerpc/iseries: Fix unused function warning in iSeries DT code
  ...
2009-06-16 11:30:37 -07:00
Li Zefan e212d6f250 block: remove some includings of blktrace_api.h
When porting blktrace to tracepoints, we changed to trace/block.h
for trace prober declarations.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-16 11:19:36 +02:00
unsik Kim 5ced504b1b mg_disk: seperate mg_disk.h again
Commit eec9462088 folded mg_disk.h into mg_disk.c,
but the mg_disk platform driver needs private data for operation. This change
also makes mg_disk.c machine independent. Separate only the needed structures
and defines into mg_disk.h.

Signed-off-by: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-16 08:40:20 +02:00
GeunSik Lim 156f5a7801 debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
Many developers use "/debug/", "/debugfs/", or "/sys/kernel/debug/" as the
directory on which to mount the debugfs filesystem for ftrace, according to
the ./Documentation/tracers/ftrace.txt file.

All three directory names (/debug/, /debugfs/, /sys/kernel/debug/) also
appear in kernel sources such as ftrace, DRM, Wireless, Documentation, and
Network [sky2] files as the debugfs mount point.

debugfs is the debug filesystem by Greg Kroah-Hartman, meant to make
debugging easy. "/sys/kernel/debug/" is the suitable directory name
for mounting the debugfs filesystem.
- debugfs related reference: http://lwn.net/Articles/334546/

Fix inconsistency of directory name to mount debugfs filesystem.

* From Steven Rostedt
  - find_debugfs() and tracing_files() in this patch.

Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com>
Acked-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: James Smart <james.smart@emulex.com>
CC: Jiri Kosina <trivial@kernel.org>
CC: David Airlie <airlied@linux.ie>
CC: Peter Osterlund <petero2@telia.com>
CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
CC: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
CC: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:28 -07:00
Roel Kluin 5888fd30ac block/ps3: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:28 -07:00
Greg Kroah-Hartman a1b4b12b37 xen block: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.


Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.osdl.org
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:27 -07:00
Kay Sievers 1ce8a0d396 Driver Core: aoe: add nodename for aoe devices
This adds support to the AOE core to report the proper device name to
userspace for the AOE devices.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:26 -07:00
Kay Sievers b03f38b685 Driver Core: block: add nodename support for block drivers.
This adds support for block drivers to report their requested nodename
to userspace.  It also updates a number of block drivers to provide the
needed subdirectory and device name to be used for them.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:25 -07:00
David S. Miller 9cbc1cb8cd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
	drivers/scsi/fcoe/fcoe.c
	net/core/drop_monitor.c
	net/core/net-traces.c
2009-06-15 03:02:23 -07:00
Geert Uytterhoeven 03fa68c245 ps3: shorten ps3_system_bus_[gs]et_driver_data to ps3_system_bus_[gs]et_drvdata
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Geoff Levand <geoffrey.levand@am.sony.com>
Cc: Jim Paris <jim@jtan.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 16:47:24 +10:00
Roel Kluin 6dee2c87eb block/ps3: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.

[Geert: Use ps3_system_bus_[gs]et_driver_data() for ps3_system_bus_device]

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 16:47:23 +10:00
Geert Uytterhoeven 1bd9784f5e ps3vram: Make ps3vram_priv.reports a void *
So we can kill a cast.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 16:47:23 +10:00
Geert Uytterhoeven c3b94fd800 ps3vram: Remove no longer used ps3vram_priv.ddr_base
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 16:47:22 +10:00
Geert Uytterhoeven fb89e89d0f ps3vram: Replace mutex by spinlock + bio_list
Remove the mutex serializing access to the cache.
Instead, queue up new requests on a bio_list if the driver is busy.

This improves sequential write performance by ca. 2%.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 16:47:22 +10:00
Geert Uytterhoeven d3352c9f1e ps3fb/vram: Extract common GPU stuff into <asm/ps3gpu.h>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: linux-fbdev-devel@lists.sourceforge.net
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:26:20 +10:00
Geert Uytterhoeven 56ac72dba5 ps3vram: GPU memory mapping cleanup
- Make the IOMMU flags used for mapping main memory into the GPU's I/O space
  explicit, instead of relying on the default in the hypervisor,
- Add missing calls to lv1_gpu_context_iomap(..., CBE_IOPTE_M) to unmap the
  memory during cleanup.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:26:20 +10:00
Jim Paris 3273d8778f ps3vram: Correct exchanged gotos in ps3vram_probe() error path
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:26:18 +10:00
Geert Uytterhoeven 3c20e2f279 ps3vram: Use proc_create_data() instead of proc_create()
Use proc_create_data() to avoid race conditions.

Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:26:18 +10:00
Geert Uytterhoeven 734957c897 ps3vram: Fix error path (return -EIO) for short read/write
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:26:17 +10:00
Linus Torvalds 489f7ab6c1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (31 commits)
  trivial: remove the trivial patch monkey's name from SubmittingPatches
  trivial: Fix a typo in comment of addrconf_dad_start()
  trivial: usb: fix missing space typo in doc
  trivial: pci hotplug: adding __init/__exit macros to sgi_hotplug
  trivial: Remove the hyphen from git commands
  trivial: fix ETIMEOUT -> ETIMEDOUT typos
  trivial: Kconfig: .ko is normally not included in module names
  trivial: SubmittingPatches: fix typo
  trivial: Documentation/dell_rbu.txt: fix typos
  trivial: Fix Pavel's address in MAINTAINERS
  trivial: ftrace:fix description of trace directory
  trivial: unnecessary (void*) cast removal in sound/oss/msnd.c
  trivial: input/misc: Fix typo in Kconfig
  trivial: fix grammo in bus_for_each_dev() kerneldoc
  trivial: rbtree.txt: fix rb_entry() parameters in sample code
  trivial: spelling fix in ppc code comments
  trivial: fix typo in bio_alloc kernel doc
  trivial: Documentation/rbtree.txt: cleanup kerneldoc of rbtree.txt
  trivial: Miscellaneous documentation typo fixes
  trivial: fix typo milisecond/millisecond for documentation and source comments.
  ...
2009-06-14 13:46:25 -07:00
Linus Torvalds 02a99ed620 Merge branch 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze: (55 commits)
  microblaze: Don't use access_ok for unaligned
  microblaze: remove unused flat_stack_align() definition
  microblaze: Fix problem with early_printk in startup
  microblaze_mmu_v2: Makefiles
  microblaze_mmu_v2: Kconfig update
  microblaze_mmu_v2: stat.h MMU update
  microblaze_mmu_v2: Elf update
  microblaze_mmu_v2: Update dma.h for MMU
  microblaze_mmu_v2: Update cacheflush.h
  microblaze_mmu_v2: Update signal returning address
  microblaze_mmu_v2: Traps MMU update
  microblaze_mmu_v2: Enable fork syscall for MMU and add fork as vfork for noMMU
  microblaze_mmu_v2: Update linker script for MMU
  microblaze_mmu_v2: Add MMU related exceptions handling
  microblaze_mmu_v2: uaccess MMU update
  microblaze_mmu_v2: Update exception handling - MMU exception
  microblaze_mmu_v2: entry.S, entry.h
  microblaze_mmu_v2: Add CURRENT_TASK for entry.S
  microblaze_mmu_v2: MMU asm offset update
  microblaze_mmu_v2: Update tlb.h and tlbflush.h
  ...
2009-06-12 13:15:17 -07:00
Pavel Machek 4737f0978d trivial: Kconfig: .ko is normally not included in module names
.ko is normally not included in Kconfig help, make it consistent.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-06-12 18:01:50 +02:00
Mike Frysinger 98e9444474 virtio_blk: add missing __dev{init,exit} markings
The remove member of the virtio_driver structure uses __devexit_p(), so
the remove function itself should be marked with __devexit.  And where
there be __devexit on the remove, so is there __devinit on the probe.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:39 +09:30
Michael S. Tsirkin d2a7ddda9f virtio: find_vqs/del_vqs virtio operations
This replaces find_vq/del_vq with find_vqs/del_vqs virtio operations,
and updates all drivers. This is needed for MSI support, because MSI
needs to know the total number of vectors upfront.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ lguest/9p compile fixes)
2009-06-12 22:16:36 +09:30
Rusty Russell 9499f5e7ed virtio: add names to virtqueue struct, mapping from devices to queues.
Add a linked list of all virtqueues for a virtio device: this helps for
debugging and is also needed for upcoming interface change.

Also, add a "name" field for clearer debug messages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:36 +09:30
Ondrej Zary 5e50b9ef97 floppy: fix hibernation
Based on Ingo Molnar's patch from 2006, this makes the floppy work after
resume from hibernation, at least on my machine.

This fix resets the floppy controller on resume.  It was experimentally
determined to bring the controller back to life - we don't really know why
it works.

floppy_init() does the same thing at boot/modprobe time.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-10 23:07:16 +02:00
Robert P. J. Day 1adbee50fd ramdisk: remove long-deprecated "ramdisk=" boot-time parameter
The "ramdisk" parameter was removed from the defunct rd.c file quite some
time ago, in favour of the more specific "ramdisk_size" parameter so, for
consistency, the same should be done here.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-10 23:07:15 +02:00
john cooper 1d589bb16b Add serial number support for virtio_blk, V4a
This patch extracts the opaque data from pci i/o
region 0 via the added VIRTIO_BLK_F_IDENTIFY
field.  By convention this data takes the form of
that returned by an ATA IDENTIFY DEVICE command,
however the driver (except for structure size)
makes no interpretation of the data.  The structure
data is copied wholesale to userspace via a
HDIO_GET_IDENTITY ioctl command (eg: hdparm -i <dev>).

Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 14:41:40 +02:00
scameron@beardog.cca.cpqcorp.net 3969251b80 cciss: decode unit attention in SCSI error handling code
Make SCSI reset error handler decode unit attention ASC
and after a target reset wait for a unit attention that indicates
a reset occurred rather than just for any old unit attention.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:44 +02:00
scameron@beardog.cca.cpqcorp.net 72f9f1324f cciss: Remove no longer needed sendcmd reject processing code
Now that the cciss SCSI error handling routines operate with interrupts
enabled, we no longer need to maintain the list of command completions that
sendcmd() might inadvertently scoop up, since now it only runs at driver init
time, and there won't be any other commands for it to scoop up.  So we
can remove that list and the code that adds to it and processes it.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:43 +02:00
scameron@beardog.cca.cpqcorp.net 85cc61ae41 cciss: change SCSI error handling routines to work with interrupts enabled.
Change cciss scsi error handling routines to work with interrupts enabled.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:43 +02:00
scameron@beardog.cca.cpqcorp.net 789a424ad1 cciss: separate error processing and command retrying code in sendcmd_withirq_core()
Separate the error processing from sendcmd_withirq_core from the code
which retries commands.  The rationale for this is that the SCSI error
handling code can then be made to use sendcmd_withirq_core, but avoid
retrying commands.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:43 +02:00
scameron@beardog.cca.cpqcorp.net 3c2ab40296 cciss: factor out fix target status processing code from sendcmd functions
Factor out code to process target status of completed commands in sendcmd()
and sendcmd_withirq_core(), and fix problem that bad target status was ignored in
sendcmd_withirq_core.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:43 +02:00
scameron@beardog.cca.cpqcorp.net b57695fe13 cciss: simplify interface of sendcmd() and sendcmd_withirq()
Simplify interfaces of sendcmd() and sendcmd_withirq() so that they
provide only one way to address commands instead of three ways.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:42 +02:00
scameron@beardog.cca.cpqcorp.net 5390cfc3fe cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code
Factor the core of sendcmd_withirq out to provide a simpler interface
which provides access to full error information.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:42 +02:00
scameron@beardog.cca.cpqcorp.net 40df6ae427 cciss: Use schedule_timeout_uninterruptible in SCSI error handling code
Use schedule_timeout_uninterruptible instead of schedule_timeout in the
scsi error handling code when waiting between TUR polls, since we are not
interested in signals and do not want to be interrupted by them.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-09 05:47:42 +02:00
Andrew Morton 77b0308a07 cciss: use schedule_timeout_interruptible()
Use schedule_timeout_interruptible() instead of open-coding the set and
schedule parts.

Cc: Mike Miller <mikem@beardog.cca.cpqcorp.net>
Cc: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-02 14:51:30 +02:00
Andrew Patterson 7fe063268e cciss: add cciss driver sysfs entries
Add sysfs entries to the cciss driver needed for the dm/multipath tools.

A file for vendor, model, rev, and unique_id is added for each logical
drive under directory /sys/bus/pci/devices/<dev>/ccissX/cXdY, where X is
the controller (or host) number and Y is the logical drive number.

A link from /sys/bus/pci/devices/<dev>/ccissX/cXdY/block:cciss!cXdY to
/sys/block/cciss!cXdY/device is also created.  A bus is created in
/sys/bus/cciss.  A link is created from the pci ccissX entry to
/sys/bus/cciss/devices/ccissX.  Please consider this for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-02 14:48:39 +02:00
Stephen M. Cameron 88f627ae39 cciss: fix SCSI device reset handler
Fix the SCSI reset error handler to send a working, properly addressed
reset message to the target device and add code to wait for the target
device to become ready by polling it with Test Unit Ready.

The existing reset code was broken in that it didn't bother to set the
8-byte LUN address to anything besides zero, so the command was addressed
to the controller, which pretended to the driver that the command
succeeded, while doing nothing.  Ages ago I tested this code, but
unbeknownst to me, my test was flawed, and what I thought was a tape drive
getting reset was actually nothing of the sort.  Unfortunately, there is
still lots of Smartarray firmware that doesn't handle doing target resets
right, and this code won't help in those cases, but it also shouldn't make
things worse in those cases than they already are.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Cc: Mike Miller <mikem@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-02 14:48:11 +02:00
Stephen M. Cameron 4a4b2d7684 cciss: factor out core of sendcmd() for a more sane interface
Factor out the core of sendcmd() to provide a simpler interface which
exposes all the error information to the caller and make the original
sendcmd use this new function.  Rationale: The SCSI error handling
routines need to send commands with interrupts turned off, but they also
need access to the full error information.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Cc: Mike Miller <mikem@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-02 14:47:50 +02:00
David S. Miller 438263ac58 aoe: Remove superfluous clearing of skb fields in new_skb().
This code uses alloc_skb() which clears them out for us.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-27 17:09:44 -07:00
Martin K. Petersen ae03bf639a block: Use accessor functions for queue limits
Convert all external users of queue limits to using wrapper functions
instead of poking the request queue variables directly.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-22 23:22:54 +02:00
Martin K. Petersen e1defc4ff0 block: Do away with the notion of hardsect_size
Until now we have had a 1:1 mapping between storage device physical
block size and the logical block size used when addressing the device.
With SATA 4KB drives coming out that will no longer be the case.  The
sector size will be 4KB but the logical block size will remain
512-bytes.  Hence we need to distinguish between the physical block size
and the logical ditto.

This patch renames hardsect_size to logical_block_size.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-22 23:22:54 +02:00
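
The distinction drawn in e1defc4ff0 above can be illustrated with a minimal sketch in plain user-space C. This is not kernel code: the struct blk_sizes type and both helper names are hypothetical and exist only for this example. It assumes the physical block size is a whole multiple of the logical block size, as in the 4KB physical / 512-byte logical case the commit describes.

```c
#include <stdint.h>

/* Hypothetical illustration type; not the kernel's queue limits structure. */
struct blk_sizes {
	uint32_t logical_block_size;   /* unit used when addressing the device, in bytes */
	uint32_t physical_block_size;  /* sector size of the medium, in bytes */
};

/* Number of logical blocks that share one physical block. */
static uint32_t logical_per_physical(const struct blk_sizes *s)
{
	return s->physical_block_size / s->logical_block_size;
}

/* Non-zero when logical block address lba starts on a physical block boundary. */
static int starts_on_physical_boundary(const struct blk_sizes *s, uint64_t lba)
{
	return (lba % logical_per_physical(s)) == 0;
}
```

With 512-byte logical blocks on a 4096-byte physical sector, eight logical blocks map to each physical block, so only every eighth logical address is aligned to the medium.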
Jens Axboe 9bd7de51ee Merge branch 'master' into for-2.6.31
Conflicts:
	drivers/ide/ide-io.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-22 20:28:35 +02:00
Roel Kluin b9ed7252d2 xen-blkfront: beyond ARRAY_SIZE of info->shadow
Do not go beyond ARRAY_SIZE of info->shadow.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-22 09:59:51 +02:00
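
The kind of bounds check that b9ed7252d2 above is about can be sketched as follows. This is a user-space illustration under stated assumptions, not the xen-blkfront code: RING_SLOTS, struct shadow_entry, struct front_info and shadow_index_valid() are hypothetical names, and ARRAY_SIZE is defined locally in its usual sizeof form.

```c
#include <stddef.h>

/* Local definition of the usual element-count macro for true arrays. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

#define RING_SLOTS 32	/* hypothetical ring size for this example */

struct shadow_entry {
	unsigned long id;
};

struct front_info {
	struct shadow_entry shadow[RING_SLOTS];
};

/* Non-zero only when i is a usable index into info->shadow. */
static int shadow_index_valid(const struct front_info *info, size_t i)
{
	return i < ARRAY_SIZE(info->shadow);
}
```

The comparison is strict: an index equal to ARRAY_SIZE() already points one element past the end of the array.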
Michal Simek 6fa612b56c microblaze: Kconfig: Enable drivers for Microblaze
Signed-off-by: Michal Simek <monstr@monstr.eu>
2009-05-21 15:56:04 +02:00
Tejun Heo 5f49f63178 block: set rq->resid_len to blk_rq_bytes() on issue
In commit c3a4d78c58, while introducing
rq->resid_len, the default value of residue count was changed from
full count to zero.  The conversion was done under the assumption that
when a request fails residue count wasn't defined.  However, Boaz and
James pointed out that this wasn't true and the residue count should
be preserved for failed requests too.

This patchset restores the original behavior by setting rq->resid_len
to blk_rq_bytes(rq) on request start and restoring explicit clearing
in affected drivers.  While at it, take advantage of the fact that
rq->resid_len is set to full count where applicable.

* ide-cd: rq->resid_len cleared on pc success

* mptsas: req->resid_len cleared on success

* sas_expander: rsp/req->resid_len cleared on success

* mpt2sas_transport: req->resid_len cleared on success

* ide-cd, ide-tape, mptsas, sas_host_smp, mpt2sas_transport, ub: take
  advantage of initial full count to simplify code

Boaz Harrosh spotted a bug in resid_len initialization.  Fixed as
suggested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Borislav Petkov <petkovbb@googlemail.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-19 11:36:08 +02:00
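
The residue-count convention restored by 5f49f63178 above can be modelled with a toy request in user-space C. Everything here is an assumption made for illustration: struct toy_request and the toy_* helpers are hypothetical and stand in for the real request structure and driver completion paths. The only behaviour taken from the commit message is that the residue starts at the full byte count on issue and is cleared explicitly on success.

```c
/* Toy stand-in for a block request; not the kernel's struct request. */
struct toy_request {
	unsigned int bytes;      /* total bytes the request covers */
	unsigned int resid_len;  /* bytes not transferred */
};

/* On issue, the residue starts out as the full byte count. */
static void toy_start_request(struct toy_request *rq)
{
	rq->resid_len = rq->bytes;
}

/* A driver reporting a short transfer leaves the remainder as residue. */
static void toy_finish_partial(struct toy_request *rq, unsigned int transferred)
{
	if (transferred > rq->bytes)
		transferred = rq->bytes;	/* clamp so the residue cannot wrap */
	rq->resid_len = rq->bytes - transferred;
}

/* On full success the driver clears the residue explicitly. */
static void toy_finish_success(struct toy_request *rq)
{
	rq->resid_len = 0;
}
```

A request that fails before any driver touches resid_len therefore still reports the full count as residue, rather than zero.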
Tejun Heo 3755100dd5 ub: use __blk_end_request_all()
ub_end_rq() always tries to complete the full request.  The @cmd_len
parameter was there because rq->data_len used to be overwritten with
residue count.  Drop @cmd_len and use __blk_end_request_all().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-19 11:36:08 +02:00
Ian Campbell 31a14400e8 xen/blkfront: fix warning when deleting gendisk on unplug/shutdown
Currently blkfront gives a warning when hot unplugging due to calling
del_gendisk() with interrupts disabled (due to blkif_io_lock).

WARNING: at kernel/softirq.c:124 local_bh_enable+0x36/0x84()
Modules linked in: xenfs xen_netfront ext3 jbd mbcache xen_blkfront
Pid: 13, comm: xenwatch Not tainted 2.6.29-xs5.5.0.13 #3
Call Trace:
 [<c012611c>] warn_slowpath+0x80/0xb6
 [<c0104cf1>] xen_sched_clock+0x16/0x63
 [<c0104710>] xen_force_evtchn_callback+0xc/0x10
 [<c0104e32>] check_events+0x8/0xe
 [<c0104d9b>] xen_restore_fl_direct_end+0x0/0x1
 [<c0103749>] xen_mc_flush+0x10a/0x13f
 [<c0105bd2>] __switch_to+0x114/0x14e
 [<c011d92b>] dequeue_task+0x62/0x70
 [<c0123b6f>] finish_task_switch+0x2b/0x84
 [<c0299877>] schedule+0x66d/0x6e7
 [<c0104710>] xen_force_evtchn_callback+0xc/0x10
 [<c0104710>] xen_force_evtchn_callback+0xc/0x10
 [<c012a642>] local_bh_enable+0x36/0x84
 [<c022f9a7>] sk_filter+0x57/0x5c
 [<c0233dae>] netlink_broadcast+0x1d5/0x315
 [<c01c6371>] kobject_uevent_env+0x28d/0x331
 [<c01e7ead>] device_del+0x10f/0x120
 [<c01e7ec6>] device_unregister+0x8/0x10
 [<c015f86d>] bdi_unregister+0x2d/0x39
 [<c01bf6f4>] unlink_gendisk+0x23/0x3e
 [<c01ac946>] del_gendisk+0x7b/0xe7
 [<d0828c19>] blkfront_closing+0x28/0x6e [xen_blkfront]
 [<d082900c>] backend_changed+0x3ad/0x41d [xen_blkfront]

We can fix this by calling del_gendisk() later in blkfront_closing, after
releasing blkif_io_lock. Since the queue is stopped during the
interrupts-disabled phase, I don't think there is any danger of an event
occurring between releasing the blkif_io_lock and deleting the disk.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-19 08:27:42 +02:00
Ian Campbell 28afea5b2f xen/blkfront: allow xenbus state transition to Closing->Closed when not Connected
This situation can occur when attempting to attach a block device whose
backend is an empty physical CD-ROM drive. The backend in this case
will go directly from the Initialising state to Closing->Closed.
Previously this would result in a NULL pointer deref on info->gd
(xenbus_dev_fatal does not return as a1a15ac5 seems to expect)

Cc: stable@kernel.org
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-19 08:25:48 +02:00
Jens Axboe f831cc0349 virtio_blk: get rid of unused variable
drivers/block/virtio_blk.c: In function 'blk_done':
drivers/block/virtio_blk.c:53: warning: unused variable 'nr_bytes'

Leftover from commit 1cde26f928

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-18 14:44:45 +02:00
Hannes Reinecke 1cde26f928 virtio_blk: SG_IO passthru support
Add support for SG_IO passthru to virtio_blk.  We add the scsi command
block after the normal outhdr, and the scsi inhdr with full status
information as well as the sense buffer before the regular inhdr.

[hch: forward ported, added the VIRTIO_BLK_F_SCSI flags, some comments
 and tested the whole beast]
[axboe: updated to use ->resid and not dual-path the byte count]

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ checkpatch.pl tweak)
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-18 14:41:30 +02:00
Christoph Hellwig 6c3b46f745 virtio_blk: don't blindly derefence req->rq_disk
request->rq_disk is only set for FS requests or BLOCK_PC requests
originating from the generic block layer scsi ioctls.  It's not set
for requests originating from other sources or internal cache flush
commands implemented by the patch I'll send after this.

So instead of using it to get at the private data in do_virtblk_request,
set up queue->queuedata and use it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-18 14:38:28 +02:00
Miklos Szeredi 6818173bd6 splice: implement default splice_read method
If f_op->splice_read() is not implemented, fall back to a plain read.
Use vfs_readv() to read into previously allocated pages.

This will allow splice and functions using splice, such as the loop
device, to work on all filesystems.  This includes "direct_io" files
in fuse which bypass the page cache.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 14:13:10 +02:00
Tejun Heo 9934c8c045 block: implement and enforce request peek/start/fetch
Till now block layer allowed two separate modes of request execution.
A request is always acquired from the request queue via
elv_next_request().  After that, drivers are free to either dequeue it
or process it without dequeueing.  Dequeue allows elv_next_request()
to return the next request so that multiple requests can be in flight.

Executing requests without dequeueing has its merits mostly in
allowing drivers for simpler devices which can't do sg to deal with
segments only without considering request boundary.  However, the
benefit this brings is dubious and declining while the cost of the API
ambiguity is increasing.  Segment based drivers are usually for very
old or limited devices and as converting to dequeueing model isn't
difficult, it doesn't justify the API overhead it puts on block layer
and its more modern users.

Previous patches converted all block low level drivers to dequeueing
model.  This patch completes the API transition by...

* renaming elv_next_request() to blk_peek_request()

* renaming blkdev_dequeue_request() to blk_start_request()

* adding blk_fetch_request() which is combination of peek and start

* disallowing completion of queued (not started) requests

* applying new API to all LLDs

Renamings are for consistency and to break out of tree code so that
it's apparent that out of tree drivers need updating.
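
The peek/start/fetch split described above can be sketched with a toy
userspace queue.  This is only an illustration of the model, not the
kernel implementation: struct toy_queue, struct toy_request and the
toy_*() helpers are hypothetical names, and only the comments refer to
the real blk_*_request() functions named in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; every name here is hypothetical, not kernel API. */
struct toy_request {
	int id;
	int started;			/* set once the driver has dequeued it */
	struct toy_request *next;
};

struct toy_queue {
	struct toy_request *head;	/* oldest queued (not started) request */
};

/* Look at the next request without removing it (cf. blk_peek_request()). */
static struct toy_request *toy_peek(struct toy_queue *q)
{
	return q->head;
}

/* Dequeue a peeked request and mark it in flight (cf. blk_start_request()). */
static void toy_start(struct toy_queue *q, struct toy_request *rq)
{
	assert(q->head == rq);		/* only the head can be started */
	q->head = rq->next;
	rq->next = NULL;
	rq->started = 1;
}

/* Peek and start in one step (cf. blk_fetch_request()). */
static struct toy_request *toy_fetch(struct toy_queue *q)
{
	struct toy_request *rq = toy_peek(q);

	if (rq)
		toy_start(q, rq);
	return rq;
}

/* Only started requests may be completed; returns 0 on success. */
static int toy_complete(struct toy_request *rq)
{
	if (!rq->started)
		return -1;		/* queued requests must not be completed */
	rq->started = 0;
	return 0;
}
```

Because starting a request removes it from the queue, the next peek
returns the following request, which is what lets several requests be
in flight at once.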

[ Impact: block request issue API cleanup, no functional change ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: unsik Kim <donari75@gmail.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Laurent Vivier <Laurent@lvivier.info>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Stefan Weinhuber <wein@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:18 +02:00
Tejun Heo 296b2f6ae6 block: convert to dequeueing model (easy ones)
plat-omap/mailbox, floppy, viocd, mspro_block, i2o_block and
mmc/card/queue are already pretty close to dequeueing model and can be
converted with simple changes.  Convert them.

While at it,

* xen-blkfront: !fs check moved downwards to share dequeue call with
  normal path.

* mspro_block: __blk_end_request(..., blk_rq_cur_byte()) converted to
  __blk_end_request_cur()

* mmc/card/queue: loop of __blk_end_request() converted to
  __blk_end_request_all()

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:17 +02:00
Tejun Heo fb3ac7f6b8 z2ram: dequeue in-flight request
z2ram processes requests one-by-one synchronously and can be easily
converted to dequeueing model.  Convert it.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:17 +02:00
Tejun Heo bab2a807a4 xd: dequeue in-flight request
xd processes requests one-by-one synchronously and can be easily
converted to dequeueing model.  Convert it.

While at it, use rq_cur_bytes instead of rq_bytes when checking for
sector overflow.  This is for consistency and better behavior for
merged requests.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:16 +02:00
Tejun Heo 06b0608e2b swim: dequeue in-flight request
swim processes requests one-by-one synchronously and can easily be
converted to dequeuing model.  Convert it.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Laurent Vivier <Laurent@lvivier.info>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:16 +02:00
Tejun Heo 9e31bebee2 amiflop: dequeue in-flight request
Request processing in amiflop is done sequentially in
redo_fd_request() proper and redo_fd_request() can easily be converted
to track in-flight request.  Remove CURRENT, track in-flight request
directly and dequeue it when processing starts.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:16 +02:00
Tejun Heo 10e1e629b3 ps3disk: dequeue in-flight request
Other than in issue error paths, ps3disk always completely finishes
fetched requests.  With full completion on error paths, it can be
easily converted to dequeueing model.

* After L1 r/w call failure, ps3disk_submit_request_sg() now fails the
  whole request.  Issue failure isn't likely to benefit from partial
  retry anyway and ps3disk uses full failure in completion error path
  too, so I don't think this amounts to any meaningful functionality
  loss.

* flush completion is converted to _all for consistency.  It doesn't
  make any functional difference.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:16 +02:00
Tejun Heo b12d4f82c1 paride: dequeue in-flight request
pd/pf/pcd track the in-flight request using pd/pf/pcd_req.  They can be
converted to dequeueing model by updating fetching and completion
paths.  Convert them.

Note that removal of elv_next_request() call from pf_next_buf()
doesn't make any functional difference.  The path is traveled only
during partial completion of a request and elv_next_request() call
must return the same request anyway.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Tim Waugh <tim@cyberelk.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:16 +02:00
Tejun Heo 2d75ce084e xsysace: dequeue in-flight request
xsysace already tracks in-flight request using ace->req.  Converting
to dequeueing model is mostly a matter of adding dequeueing call after
request fetching.  The only tricky part is handling CF removal which
should complete both in flight and on queue requests.  Convert to
dequeueing model.

While at it, remove explicit blk_rq_cur_bytes() and use
__blk_end_request_cur() instead.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:15 +02:00
Tejun Heo f4bd4b90bf swim3: dequeue in-flight request
swim3 has at most single request in flight and already tracks it using
fd_req.  Convert it to dequeuing model by updating request fetching
and wrapping completion function.

[ Impact: dequeue in-flight request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:15 +02:00
Tejun Heo a336ca6fe6 ataflop: dequeue and track in-flight request
ataflop has single request in flight.  Till now, whenever it needs to
access the in-flight request it called elv_next_request().  This patch
makes ataflop track the in-flight request directly and dequeue it when
processing starts.  The added complexity is minimal and this will help
future block layer changes.

[ Impact: dequeue in-flight request, one elv_next_request() per request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:15 +02:00
Tejun Heo 8a12c4a456 hd: dequeue and track in-flight request
hd has at most single request in flight.  Till now, whenever it needs
to access the in-flight request it called elv_next_request().  This
patch makes hd track the in-flight request directly and dequeue it
when processing starts.  The added complexity is minimal and this will
help future block layer changes.

[ Impact: dequeue in-flight request, one elv_next_request() per request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:15 +02:00
Tejun Heo 5b36ad6000 mg_disk: dequeue and track in-flight request
mg_disk has at most single request in flight per device.  Till now,
whenever it needs to access the in-flight request it called
elv_next_request().  This patch makes mg_disk track the in-flight
request directly using mg_host->req and dequeue it when processing
starts.

q->queuedata is set to mg_host so that mg_host can be determined
without fetching request from the queue.

[ Impact: dequeue in-flight request, one elv_next_request() per request ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:15 +02:00
Tejun Heo 9a8d23d885 mg_disk: fix queue hang / infinite retry on !fs requests
Both request functions in mg_disk simply return when they encounter a
!fs request, which means the request will never be cleared from the
queue causing queue hang and indefinite retry of the request.  Fix it.

While at it, flatten condition checks and add unlikely to !fs tests.

[ Impact: fix possible queue hang / infinite retry of !fs requests ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:52:14 +02:00
Tejun Heo 1011c1b9f2 block: blk_rq_[cur_]_{sectors|bytes}() usage cleanup
With the previous changes, the following are now guaranteed for all
requests in any valid state.

* blk_rq_sectors() == blk_rq_bytes() >> 9
* blk_rq_cur_sectors() == blk_rq_cur_bytes() >> 9
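
The two identities above follow from sectors being 512 (1 << 9) bytes.
A minimal userspace sketch, assuming a simplified request that only
stores byte counts; struct toy_rq, TOY_SECTOR_SHIFT and the toy_rq_*()
helpers are hypothetical stand-ins, not the kernel accessors:

```c
#include <assert.h>

#define TOY_SECTOR_SHIFT 9	/* 512-byte sectors; hypothetical name */

/* Simplified stand-in for a request; not the kernel's struct request. */
struct toy_rq {
	unsigned int bytes;	/* bytes left in the whole request */
	unsigned int cur_bytes;	/* bytes left in the current segment */
};

static unsigned int toy_rq_bytes(const struct toy_rq *rq)
{
	return rq->bytes;
}

static unsigned int toy_rq_cur_bytes(const struct toy_rq *rq)
{
	return rq->cur_bytes;
}

/* Sector counts are derived from the byte counts, so they cannot drift. */
static unsigned int toy_rq_sectors(const struct toy_rq *rq)
{
	return toy_rq_bytes(rq) >> TOY_SECTOR_SHIFT;
}

static unsigned int toy_rq_cur_sectors(const struct toy_rq *rq)
{
	return toy_rq_cur_bytes(rq) >> TOY_SECTOR_SHIFT;
}
```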

Clean up accessor usages.  Notable changes are

* nbd,i2o_block: end_all used instead of explicit byte count
* scsi_lib: unnecessary conditional on request type removed

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:50:55 +02:00
Tejun Heo b079041030 block: cleanup rq->data_len usages
With recent unification of fields, it's now guaranteed that
rq->data_len always equals blk_rq_bytes().  Convert all non-IDE direct
users to accessors.  IDE will be converted in a separate patch.

Boaz: spotted incorrect data_len/resid_len conversion in osd.

[ Impact: convert direct rq->data_len usages to blk_rq_bytes() ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:50:55 +02:00
Tejun Heo 83096ebf12 block: convert to pos and nr_sectors accessors
With recent cleanups, there is no place where low level driver
directly manipulates request fields.  This means that the 'hard'
request fields always equal the !hard fields.  Convert all
rq->sectors, nr_sectors and current_nr_sectors references to
accessors.

While at it, drop superfluous blk_rq_pos() < 0 test in swim.c.

[ Impact: use pos and nr_sectors accessors ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Tested-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Acked-by: Mike Miller <mike.miller@hp.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Dario Ballabio <ballabio_dario@emc.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: unsik Kim <donari75@gmail.com>
Cc: Laurent Vivier <Laurent@lvivier.info>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:50:54 +02:00
Tejun Heo 5b93629b45 block: implement blk_rq_pos/[cur_]sectors() and convert obvious ones
Implement accessors - blk_rq_pos(), blk_rq_sectors() and
blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors
and rq->hard_cur_sectors respectively and convert direct references of
the said fields to the accessors.

This is in preparation of request data length handling cleanup.

Geert	: suggested adding const to struct request * parameter to accessors
Sergei	: spotted error in patch description

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:50:53 +02:00
Tejun Heo c3a4d78c58 block: add rq->resid_len
rq->data_len served two purposes - the length of data buffer on issue
and the residual count on completion.  This duality creates some
headaches.

First of all, block layer and low level drivers can't really determine
what rq->data_len contains while a request is executing.  It could be
the total request length or it could be anything else one of the
lower layers is using to keep track of residual count.  This
complicates things because blk_rq_bytes() and thus
[__]blk_end_request_all() relies on rq->data_len for PC commands.
Drivers which want to report residual count should first cache the
total request length, update rq->data_len and then complete the
request with the cached data length.

Secondly, it makes requests default to reporting full residual count,
ie. reporting that no data transfer occurred.  The residual count is
an exception not the norm; however, the driver should clear
rq->data_len to zero to signify the normal cases while leaving it
alone means no data transfer occurred at all.  This reverse default
behavior complicates code unnecessarily and renders block PC on some
drivers (ide-tape/floppy) unusable.

This patch adds rq->resid_len which is used only for residual count.
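
A rough userspace sketch of the idea, assuming a simplified request
with just the two length fields; struct toy_pc_request and the toy_*()
helpers are hypothetical and do not mirror the kernel code:

```c
#include <assert.h>

/* Simplified stand-in; field names follow the message, the rest is made up. */
struct toy_pc_request {
	unsigned int data_len;	/* buffer length at issue; never overwritten */
	unsigned int resid_len;	/* bytes NOT transferred; 0 unless reported */
};

static void toy_init(struct toy_pc_request *rq, unsigned int len)
{
	rq->data_len = len;
	rq->resid_len = 0;	/* default: everything was transferred */
}

/* A driver that moved only 'done' bytes reports the shortfall. */
static void toy_report_short(struct toy_pc_request *rq, unsigned int done)
{
	assert(done <= rq->data_len);
	rq->resid_len = rq->data_len - done;
}

static unsigned int toy_transferred(const struct toy_pc_request *rq)
{
	return rq->data_len - rq->resid_len;
}
```

With a separate field, a driver that never touches the residual count
reports a full transfer, and the total length stays available for
completion without caching it first.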

While at it, remove now unnecessary blk_rq_bytes() caching in
ide_pc_intr() as rq->data_len is not changed anymore.

Boaz	: spotted missing conversion in osd
Sergei	: spotted too early conversion to blk_rq_bytes() in ide-tape

[ Impact: cleanup residual count handling, report 0 resid by default ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Borislav Petkov <petkovbb@googlemail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Eric Moore <Eric.Moore@lsi.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:50:53 +02:00
Tejun Heo 53d6979ab6 nbd: don't clear rq->sector and nr_sectors unnecessarily
There's no reason to clear rq->sector and nr_sectors after calling
blk_rq_init().  They're guaranteed to be clear.  Drop unnecessary
clearing.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11 09:50:53 +02:00
Tejun Heo 0191944282 hd: fix locking
hd dances around local irq and HD_IRQ enable without achieving much.
It ends up transferring data from irq handler with both local irq and
HD_IRQ disabled.  The only place it actually does something is while
transferring the first block of a request which it does with HD_IRQ
disabled but local irq enabled.

Unfortunately, the dancing is horribly broken from locking POV.  IRQ
and timeout handlers access block queue without grabbing the queue
lock and running the driver in SMP configuration crashes the whole
machine pretty quickly.

Remove meaningless irq enable/disable dancing and add proper locking
in issue, irq and timeout paths.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-04-28 20:24:20 +02:00
Bartlomiej Zolnierkiewicz 0d9f346fb0 mg_disk: fix CONFIG_LBD=y warning
drivers/block/mg_disk.c: In function ‘mg_dump_status’:
drivers/block/mg_disk.c:265: warning: format ‘%ld’ expects type ‘long int’, but
argument 2 has type ‘sector_t’
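
One common way to avoid this class of warning is to cast the value to
unsigned long long and print it with %llu.  The message does not say
which fix this commit applied, so the following userspace sketch is an
assumption; toy_sector_t and toy_format_sector() are hypothetical
names standing in for a 64-bit sector_t and the driver's print call.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a sector_t that is 64 bits wide, as with CONFIG_LBD=y. */
typedef unsigned long long toy_sector_t;

/* Format a sector number; the explicit cast keeps %llu correct for any width. */
static int toy_format_sector(char *buf, size_t len, toy_sector_t sect)
{
	return snprintf(buf, len, "sector %llu", (unsigned long long)sect);
}
```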

[ Impact: kill build warning ]

Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-04-28 20:24:20 +02:00
Tejun Heo 39f36b47ca mg_disk: fix locking
IRQ and timeout handlers call functions which expect locked queue lock
without locking it.  Fix it.

While at it, convert 0s used as null pointer constant to NULLs.

[ Impact: fix locking, cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: unsik Kim <donari75@gmail.com>
2009-04-28 20:24:19 +02:00
Bartlomiej Zolnierkiewicz f68adec3c7 mg_disk: use defines from <linux/ata.h>
While at it:
- remove MG_REG_HEAD_MUST_BE_ON define
- remove MG_REG_CTRL_INTR_ENABLE define
- remove MG_REG_HEAD_LBA_MODE define
- remove unused defines

Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:52 +02:00
Bartlomiej Zolnierkiewicz 8a11a789c3 mg_disk: fix dependency on libata
Add local copies of ata_id_string() and ata_id_c_string() to mg_disk
so there is no need for the driver to depend on ATA and SCSI.

[ Impact: break dependency on libata by copying ata id string functions ]

Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:52 +02:00
Tejun Heo a03bb5a32f mg_disk: clean up request completion paths
mg_disk implements its own partial completion.  Convert to standard
block layer partial completion.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:51 +02:00
Tejun Heo eec9462088 mg_disk: fold mg_disk.h into mg_disk.c
include/linux/mg_disk.h is used only by drivers/block/mg_disk.c.  No
reason to put it in a separate header.  Fold it into mg_disk.c.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:51 +02:00
Tejun Heo e138b4e08e swim: clean up request completion paths
swim curiously tries to update request parameters before calling
__blk_end_request() when __blk_end_request() will do it anyway and
unnecessarily checks whether current_nr_sectors is zero right after
fetching.

Drop unnecessary stuff and use standard block layer mechanisms.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Laurent Vivier <Laurent@lvivier.info>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:51 +02:00
Tejun Heo 467ca759fc swim3: clean up request completion paths
swim3 curiously tries to update request parameters before calling
__blk_end_request() when __blk_end_request() will do it anyway, and it
updates request for partial completion manually instead of using
blk_update_request().  Also, it does some spurious checks on rq such
as testing whether rq->sector is negative or current_nr_sectors is
zero right after fetching.

Drop unnecessary stuff and use standard block layer mechanisms.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:51 +02:00
Tejun Heo e091eb67af hd: clean up request completion paths
hd read/write_intr() functions manually manipulate request to
incrementally complete it, which block layer already supports.  Simply
use block layer completion routines instead of manual partial
completion.

While at it, clear unnecessary elv_next_request() check at the tail of
read_intr().  This also makes read and write_intr() more consistent.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:51 +02:00
Tejun Heo 044208506d sunvdc: kill vdc_end_request()
vdc_end_request() is a thin silly wrapper on top of
__blk_end_request().  Kill it.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:50 +02:00
Tejun Heo cd4c34ebec ps3disk: simplify request completion
ps3disk_interrupt() always completes requests fully but it uses
rq->hard_cur_sectors for FLUSH requests for some reason.  Drop them
and simply use __blk_end_request_all().

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:50 +02:00
Tejun Heo 5b5c5d12b9 amiflop,ataflop,xd,mg_disk: clean up unnecessary stuff from block drivers
rq_data_dir() can only be READ or WRITE and rq->sector and nr_sectors
are always automatically updated after partial request completion.
Don't worry about rq_data_dir() not being either READ or WRITE or
manually update sector and nr_sectors.

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jörg Dorchain <joerg@dorchain.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 08:14:50 +02:00
Tejun Heo f06d9a2b52 block: replace end_request() with [__]blk_end_request_cur()
end_request() has been kept around for backward compatibility;
however, it's about time for it to go away.

* There aren't too many users left.

* Its use of @uptodate is pretty confusing.

* In some cases, newer code ends up using mixture of end_request() and
  [__]blk_end_request[_all](), which is way too confusing.

So, add [__]blk_end_request_cur() and replace end_request() with it.
Most conversions are straightforward.  Noteworthy ones are...

* paride/pcd: next_request() updated to take 0/-errno instead of 1/0.

* paride/pf: pf_end_request() and next_request() updated to take
  0/-errno instead of 1/0.

* xd: xd_readwrite() updated to return 0/-errno instead of 1/0.

* mtd/mtd_blkdevs: blktrans_discard_request() updated to return
  0/-errno instead of 1/0.  Unnecessary local variable res
  initialization removed from mtd_blktrans_thread().

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Joerg Dorchain <joerg@dorchain.net>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Laurent Vivier <Laurent@lvivier.info>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: unsik Kim <donari75@gmail.com>
2009-04-28 07:37:36 +02:00
Tejun Heo 40cbbb781d block: implement and use [__]blk_end_request_all()
There are many [__]blk_end_request() call sites which call it with
full request length and expect full completion.  Many of them ensure
that the request actually completes by doing BUG_ON() the return
value, which is awkward and error-prone.

This patch adds [__]blk_end_request_all() which takes @rq and @error
and fully completes the request.  BUG_ON() is added to ensure that
this actually happens.
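
The pattern can be sketched in userspace as a partial-completion
helper plus a wrapper that always passes the full remaining length.
struct toy_rq and the toy_*() functions are hypothetical, and assert()
stands in for the kernel's BUG_ON():

```c
#include <assert.h>

/* Simplified stand-in for a request; not the kernel's struct request. */
struct toy_rq {
	unsigned int bytes_left;
	int error;
};

/*
 * Complete nr_bytes of the request (cf. [__]blk_end_request()).
 * Returns 1 if data is still pending, 0 if the request is finished.
 */
static int toy_end_request(struct toy_rq *rq, int error, unsigned int nr_bytes)
{
	if (nr_bytes > rq->bytes_left)
		nr_bytes = rq->bytes_left;
	rq->bytes_left -= nr_bytes;
	if (error)
		rq->error = error;
	return rq->bytes_left != 0;
}

/* Complete the whole request; callers no longer check the return value. */
static void toy_end_request_all(struct toy_rq *rq, int error)
{
	int pending = toy_end_request(rq, error, rq->bytes_left);

	assert(!pending);	/* stands in for BUG_ON(pending) */
}
```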

Most conversions are simple but there are a few noteworthy ones.

* cdrom/viocd: viocd_end_request() replaced with direct calls to
  __blk_end_request_all().

* s390/block/dasd: dasd_end_request() replaced with direct calls to
  __blk_end_request_all().

* s390/char/tape_block: tapeblock_end_request() replaced with direct
  calls to blk_end_request_all().

[ Impact: cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-28 07:37:35 +02:00
Akinobu Mita e686307fdc loop: use BIO list management functions
Now that the bio list management stuff is generic, convert loop to use
bio lists instead of its own private bio list implementation.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-28 07:37:28 +02:00
Tejun Heo e93b9fb7d8 hd: fix locking
hd dances around local irq and HD_IRQ enable without achieving much.
It ends up transferring data from irq handler with both local irq and
HD_IRQ disabled.  The only place it actually does something is while
transferring the first block of a request which it does with HD_IRQ
disabled but local irq enabled.

Unfortunately, the dancing is horribly broken from locking POV.  IRQ
and timeout handlers access block queue without grabbing the queue
lock and running the driver in SMP configuration crashes the whole
machine pretty quickly.

Remove meaningless irq enable/disable dancing and add proper locking
in issue, irq and timeout paths.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-04-28 07:36:56 +02:00
Bartlomiej Zolnierkiewicz 7090a0a97f mg_disk: fix CONFIG_LBD=y warning
drivers/block/mg_disk.c: In function ‘mg_dump_status’:
drivers/block/mg_disk.c:265: warning: format ‘%ld’ expects type ‘long int’, but
argument 2 has type ‘sector_t’

[ Impact: kill build warning ]

Cc: unsik Kim <donari75@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-04-28 07:36:56 +02:00
Tejun Heo ac2ff946a5 mg_disk: fix locking
IRQ and timeout handlers call functions which expect locked queue lock
without locking it.  Fix it.

While at it, convert 0s used as null pointer constant to NULLs.

[ Impact: fix locking, cleanup ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: unsik Kim <donari75@gmail.com>
2009-04-28 07:36:56 +02:00
Sage Weil f3c737de8f umem: fix request_queue lock warning
The umem driver issues two warnings on boot, due to blk_plug_device() and
blk_remove_plug() being called without q->queue_lock held.  Starting with
e48ec690 (block: extend queue_flag bitops), the queue_flag_* functions
warn if q->queue_lock doesn't appear to be locked.  In fact, q->queue_lock
is NULL (though that apparently isn't otherwise a problem as the driver is
using card->lock for everything).

Although blk_init_queue() will take a request_fn_proc and spinlock_t*,
there isn't a corresponding init helper that takes a make_request_fn.
Setting queue_lock to &card->lock explicitly seems to work fine for me.
The warning goes away and the device appears to behave.

[    1.531881] v2.3 : Micro Memory(tm) PCI memory board block driver
[    1.538136] umem 0000:02:01.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
[    1.545018] umem 0000:02:01.0: Micro Memory(tm) controller found (PCI Mem Module (Battery Backup))
[    1.554176] umem 0000:02:01.0: CSR 0xfc9ffc00 -> 0xffffc200013d0c00 (0x100)
[    1.561279] umem 0000:02:01.0: Size 1048576 KB, Battery 1 Disabled (FAILURE), Battery 2 Disabled (FAILURE)
[    1.571114] umem 0000:02:01.0: Window size 16777216 bytes, IRQ 20
[    1.577304] umem 0000:02:01.0: memory NOT initialized. Consider over-writing whole device.
[    1.585989]  umema:<4>------------[ cut here ]------------
[    1.591775] WARNING: at include/linux/blkdev.h:492 blk_plug_device+0x6d/0x106()
[    1.592025] Hardware name: H8SSL
[    1.592025] Modules linked in:
[    1.592025] Pid: 1, comm: swapper Not tainted 2.6.29 #8
[    1.592025] Call Trace:
[    1.592025]  [<ffffffff8023c994>] warn_slowpath+0xd3/0xf2
[    1.592025]  [<ffffffff8025a5b5>] ? save_trace+0x3f/0x9b
[    1.592025]  [<ffffffff8025a68b>] ? add_lock_to_list+0x7a/0xba
[    1.592025]  [<ffffffff8025e609>] ? validate_chain+0xb3b/0xce8
[    1.592025]  [<ffffffff80441556>] ? mm_make_request+0x27/0x59
[    1.592025]  [<ffffffff80441556>] ? mm_make_request+0x27/0x59
[    1.592025]  [<ffffffff8025ef04>] ? __lock_acquire+0x74e/0x7b9
[    1.592025]  [<ffffffff8025a70e>] ? get_lock_stats+0x34/0x5e
[    1.592025]  [<ffffffff8025a746>] ? put_lock_stats+0xe/0x27
[    1.592025]  [<ffffffff80441556>] ? mm_make_request+0x27/0x59
[    1.592025]  [<ffffffff803ad165>] blk_plug_device+0x6d/0x106
[    1.592025]  [<ffffffff80441575>] mm_make_request+0x46/0x59
[    1.592025]  [<ffffffff803ac2d9>] generic_make_request+0x335/0x3cf
[    1.592025]  [<ffffffff8027fcc7>] ? mempool_alloc_slab+0x11/0x13
[    1.592025]  [<ffffffff8027fdce>] ? mempool_alloc+0x45/0x101
[    1.592025]  [<ffffffff8025a746>] ? put_lock_stats+0xe/0x27
[    1.592025]  [<ffffffff803adda5>] submit_bio+0x10a/0x119
[    1.592025]  [<ffffffff802c8d00>] submit_bh+0xe5/0x109
[    1.592025]  [<ffffffff802cbf43>] block_read_full_page+0x2aa/0x2cb
[    1.592025]  [<ffffffff802cf4c4>] ? blkdev_get_block+0x0/0x4c
[    1.592025]  [<ffffffff805c90a8>] ? _spin_unlock_irq+0x36/0x51
[    1.592025]  [<ffffffff80286836>] ? __lru_cache_add+0x92/0xb2
[    1.592025]  [<ffffffff802cf008>] blkdev_readpage+0x13/0x15
[    1.592025]  [<ffffffff8027de06>] read_cache_page_async+0x90/0x134
[    1.592025]  [<ffffffff802ceff5>] ? blkdev_readpage+0x0/0x15
[    1.592025]  [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[    1.592025]  [<ffffffff8027deb8>] read_cache_page+0xe/0x45
[    1.592025]  [<ffffffff802f5170>] read_dev_sector+0x2e/0x93
[    1.592025]  [<ffffffff802f5f44>] adfspart_check_ICS+0x28/0x16c
[    1.592025]  [<ffffffff8025d427>] ? trace_hardirqs_on+0xd/0xf
[    1.592025]  [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[    1.592025]  [<ffffffff802f59c5>] rescan_partitions+0x168/0x2fb
[    1.592025]  [<ffffffff802ceae9>] __blkdev_get+0x259/0x336
[    1.592025]  [<ffffffff803ca1e2>] ? kobject_put+0x47/0x4b
[    1.592025]  [<ffffffff802cebd1>] blkdev_get+0xb/0xd
[    1.592025]  [<ffffffff802f5773>] register_disk+0xc4/0x12b
[    1.592025]  [<ffffffff803b2a7b>] add_disk+0xc3/0x12d
[    1.592025]  [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[    1.592025]  [<ffffffff808a1e73>] mm_init+0x129/0x1a5
[    1.592025]  [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[    1.592025]  [<ffffffff80209056>] _stext+0x56/0x130
[    1.592025]  [<ffffffff80274932>] ? register_irq_proc+0xae/0xca
[    1.592025]  [<ffffffff802f0000>] ? proc_pid_lookup+0xb4/0x18b
[    1.592025]  [<ffffffff8087f975>] kernel_init+0x132/0x18b
[    1.592025]  [<ffffffff8020d17a>] child_rip+0xa/0x20
[    1.592025]  [<ffffffff8020cb40>] ? restore_args+0x0/0x30
[    1.592025]  [<ffffffff8087f843>] ? kernel_init+0x0/0x18b
[    1.592025]  [<ffffffff8020d170>] ? child_rip+0x0/0x20
[    1.592025] ---[ end trace 7150b3b86da74e1e ]---
[    1.889858] ------------[ cut here ]------------[ve_plug+0x5f/0x91()
[    1.893848] Hardware name: H8SSL
[    1.893848] Modules linked in:
[    1.893848] Pid: 1, comm: swapper Tainted: G        W  2.6.29 #8
[    1.893848] Call Trace:
[    1.893848]  [<ffffffff8023c994>] warn_slowpath+0xd3/0xf2
[    1.893848]  [<ffffffff805c8411>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[    1.893848]  [<ffffffff8020cb40>] ? restore_args+0x0/0x30
[    1.893848]  [<ffffffff80254245>] ? __atomic_notifier_call_chain+0x0/0xb2
[    1.893848]  [<ffffffff805c90a3>] ? _spin_unlock_irq+0x31/0x51
[    1.893848]  [<ffffffff805c90bf>] ? _spin_unlock_irq+0x4d/0x51
[    1.893848]  [<ffffffff8044157d>] ? mm_make_request+0x4e/0x59
[    1.893848]  [<ffffffff8025a70e>] ? get_lock_stats+0x34/0x5e
[    1.893848]  [<ffffffff8025a75d>] ? put_lock_stats+0x25/0x27
[    1.893848]  [<ffffffff80441504>] ? mm_unplug_device+0x25/0x50
[    1.893848]  [<ffffffff803acf23>] blk_remove_plug+0x5f/0x91
[    1.893848]  [<ffffffff8044150f>] mm_unplug_device+0x30/0x50
[    1.893848]  [<ffffffff803ab74a>] blk_unplug+0x78/0x7d
[    1.893848]  [<ffffffff803ab75c>] blk_backing_dev_unplug+0xd/0xf
[    1.893848]  [<ffffffff802c853c>] block_sync_page+0x4a/0x4c
[    1.893848]  [<ffffffff8027da1c>] sync_page+0x44/0x4d
[    1.893848]  [<ffffffff805c66fd>] __wait_on_bit_lock+0x42/0x8a
[    1.893848]  [<ffffffff8027d9d8>] ? sync_page+0x0/0x4d
[    1.893848]  [<ffffffff8027d9c4>] __lock_page+0x64/0x6b
[    1.893848]  [<ffffffff802508db>] ? wake_bit_function+0x0/0x2a
[    1.893848]  [<ffffffff8027de4a>] read_cache_page_async+0xd4/0x134
[    1.893848]  [<ffffffff802ceff5>] ? blkdev_readpage+0x0/0x15
[    1.893848]  [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[    1.893848]  [<ffffffff8027deb8>] read_cache_page+0xe/0x45
[    1.893848]  [<ffffffff802f5170>] read_dev_sector+0x2e/0x93
[    1.893848]  [<ffffffff802f5f44>] adfspart_check_ICS+0x28/0x16c
[    1.893848]  [<ffffffff8025d427>] ? trace_hardirqs_on+0xd/0xf
[    1.893848]  [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[    1.893848]  [<ffffffff802f59c5>] rescan_partitions+0x168/0x2fb
[    1.893848]  [<ffffffff802ceae9>] __blkdev_get+0x259/0x336
[    1.893848]  [<ffffffff803ca1e2>] ? kobject_put+0x47/0x4b
[    1.893848]  [<ffffffff802cebd1>] blkdev_get+0xb/0xd
[    1.893848]  [<ffffffff802f5773>] register_disk+0xc4/0x12b
[    1.893848]  [<ffffffff803b2a7b>] add_disk+0xc3/0x12d
[    1.893848]  [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[    1.893848]  [<ffffffff808a1e73>] mm_init+0x129/0x1a5
[    1.893848]  [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[    1.893848]  [<ffffffff80209056>] _stext+0x56/0x130
[    1.893848]  [<ffffffff80274932>] ? register_irq_proc+0xae/0xca
[    1.893848]  [<ffffffff802f0000>] ? proc_pid_lookup+0xb4/0x18b
[    1.893848]  [<ffffffff8087f975>] kernel_init+0x132/0x18b
[    1.893848]  [<ffffffff8020d17a>] child_rip+0xa/0x20
[    1.893848]  [<ffffffff8020cb40>] ? restore_args+0x0/0x30
[    1.893848]  [<ffffffff8087f843>] ? kernel_init+0x0/0x18b
[    1.893848]  [<ffffffff8020d170>] ? child_rip+0x0/0x20
[    1.893848] ---[ end trace 7150b3b86da74e1f ]---

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-24 08:54:21 +02:00
David Vrabel 3444b26afa USB: add reset endpoint operations
Wireless USB endpoint state has a sequence number and a current
window and not just a single toggle bit.  So allow HCDs to provide an
endpoint_reset method and call this or clear the software toggles as
required (after a clear halt, set configuration etc.).

usb_settoggle() and friends are then HCD internal and are moved into
core/hcd.h and all device drivers call usb_reset_endpoint() instead.

If the device endpoint state has been reset (with a clear halt) but
the host endpoint state has not then subsequent data transfers will
not complete. The device will only work again after it is reset or
disconnected.

Signed-off-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-17 10:50:27 -07:00
Nick Piggin c2572f2b4f brd: fix cacheflushing
brd is missing a flush_dcache_page. On 2nd thoughts, perhaps it is the
pagecache's responsibility to flush user virtual aliases (the driver of
course should flush kernel virtual mappings)... but anyway, there
already exists cache flushing for one direction of transfer, so we
should add the other.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15 12:10:13 +02:00
Nick Piggin dfbc4752ea brd: support barriers
brd is always ordered (not that it matters, as it is defined not to
survive when the system goes down). So tell the block layer it is
ordered, which might be of help with testing filesystems.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15 12:10:13 +02:00
Yang Hongyang e930438c42 Replace all DMA_nBIT_MASK macro with DMA_BIT_MASK(n)
This is the second pass over the old DMA_nBIT_MASK macros, and there are not
so many of them left, so I put them into one patch. I hope this is the last round.
After this, the definition of the old DMA_nBIT_MASK macros can be removed.
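
As a rough illustration of why the per-width constants become redundant, here
is a minimal sketch of a generic mask macro next to two of the old-style
constants. The macro body is written from memory of the kernel header and the
helper name dma_masks_match() is made up for this example, so treat both as
assumptions rather than the exact upstream code.

```c
#include <assert.h>
#include <stdint.h>

/* Generic mask builder. n == 64 is special-cased because shifting a
 * 64-bit value by 64 bits is undefined behaviour in C. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Old-style fixed-width constants, kept here only for comparison. */
#define DMA_32BIT_MASK 0x00000000ffffffffULL
#define DMA_64BIT_MASK 0xffffffffffffffffULL

/* Hypothetical helper: returns 1 when the generic macro reproduces
 * both old constants. */
static int dma_masks_match(void)
{
	return DMA_BIT_MASK(32) == DMA_32BIT_MASK &&
	       DMA_BIT_MASK(64) == DMA_64BIT_MASK;
}
```

With one parameterised macro, a driver that needs an unusual width (say 24
bits) no longer requires a dedicated constant.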

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Tony Lindgren <tony@atomide.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Greg KH <greg@kroah.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 15:04:33 -07:00
Grant Likely f0edef8c8b xsysace: Fix dereferencing of cf_id after hd_driveid removal
Commit 4aaf2fec71 (xsysace: make it
'struct hd_driveid'-free) converted the cf_id member of 'struct
ace_device' from a 'struct hd_driveid' to a u16 array.  However,
references to the base of the structure were still using the '&'
operator.  When the address was used with the ata_id_u32() macro, the
compiler used the size of the entire array instead of sizeof(u16) to
calculate the offset from the base address.

This patch removes the use of the '&' operator from all references of
cf_id to fix the bug and remove future confusion.
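
The pointer arithmetic behind this bug can be shown in plain C. The sketch
below uses a stand-in structure (fake_ace_device, with a made-up ID_WORDS
size) rather than the real struct ace_device, and only measures how far
"base + 1" moves for each spelling of the base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_WORDS 256	/* hypothetical array length, for illustration only */

/* Stand-in for the driver structure; only the array member matters. */
struct fake_ace_device {
	uint16_t cf_id[ID_WORDS];
};

/* Base written as cf_id: the array decays to uint16_t *, so one step
 * advances by sizeof(uint16_t) bytes. */
static size_t step_without_ampersand(struct fake_ace_device *ace)
{
	return (size_t)((char *)(ace->cf_id + 1) - (char *)ace->cf_id);
}

/* Base written as &cf_id: the type is uint16_t (*)[ID_WORDS], so one
 * step skips the entire array. */
static size_t step_with_ampersand(struct fake_ace_device *ace)
{
	return (size_t)((char *)(&ace->cf_id + 1) - (char *)&ace->cf_id);
}
```

Any macro that indexes from the base therefore lands at the wrong offset when
it is handed &cf_id instead of cf_id.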

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-04-08 14:13:04 +02:00
Linus Torvalds 6a5d263866 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  loop: mutex already unlocked in loop_clr_fd()
  cfq-iosched: don't let idling interfere with plugging
  block: remove unused REQ_UNPLUG
  cfq-iosched: kill two unused cfqq flags
  cfq-iosched: change dispatch logic to deal with single requests at the time
  mflash: initial support
  cciss: change to discover first memory BAR
  cciss: kernel scan thread for MSA2012
  cciss: fix residual count for block pc requests
  block: fix inconsistency in I/O stat accounting code
  block: elevator quiescing helpers
2009-04-07 11:06:41 -07:00
Yang Hongyang 284901a90a dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:11 -07:00
Yang Hongyang 6a35528a83 dma-mapping: replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)
Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:10 -07:00
Alexander Beregalov ffcd7dca3a loop: mutex already unlocked in loop_clr_fd()
mount/1865 is trying to release lock (&lo->lo_ctl_mutex) at:
but there are no more locks to release!

mutex is already unlocked in loop_clr_fd(), we should not
try to unlock it in lo_release() again.

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-07 13:48:21 +02:00
unsik Kim 3fbed4c61a mflash: initial support
This driver supports the mflash IO mode on Linux.

Mflash is an embedded flash drive, mainly targeting mobile and consumer
electronic devices.

Internally, mflash has NAND flash and other hardware logic, and supports 2
different operation modes (ATA, IO).  ATA mode doesn't need any new driver
and currently works well under the standard IDE subsystem.  It is effectively
a one-chip SSD.  IO mode is an ATA-like custom mode for hosts that don't have
an IDE interface.

The following is a brief description of IO mode.
A. IO mode is based on the ATA protocol and uses some custom commands (read
confirm, write confirm).
B. IO mode uses an SRAM bus interface.
C. IO mode supports a 4kB boot area, so the host can boot from mflash.

This driver is quite similar to a standard ATA driver, but for the
following reasons it is currently separate from the ATA layer.

1. The ATA layer deals with the standard ATA protocol.  The ATA layer has
   many low-level device-specific interfaces, but data transfer follows the
   ATA rules.  mflash IO mode doesn't.

2. Even though they are currently not used in the mflash driver code, mflash
   has some custom commands and modes (nand fusing, firmware patch, etc).  If
   these features were supported in the Linux kernel, the ATA layer would
   need further changes.

3. Currently the PATA platform device driver doesn't support interrupts
   (I'm not sure).  mflash uses interrupts (polling mode is just for
   debug).

4. mflash is still somewhat under development.  Even though some companies
   already use mflash in their own products, I think more time is needed to
   standardize the custom commands and modes.  At that time (maybe October)
   I will talk with the ATA people.  If they accept integration, I will
   integrate.

Signed-off-by: unsik Kim <donari75@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-07 08:12:38 +02:00
Mike Miller e143858104 cciss: change to discover first memory BAR
Add a method for discovering the first memory BAR.  All Smart Array
controllers to date have always had the memory BAR as the first BAR.
A new controller to be released later this year breaks that model.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-07 08:12:38 +02:00
Mike Miller 0a9279cc7c cciss: kernel scan thread for MSA2012
The MSA2012 cannot inform the driver of configuration changes since all
management is out of band.  This is a departure from any storage we have
supported in the past.  We need some way to detect changes on the topology
so we implement this kernel thread.  In some instances there's nothing we
can do from the driver (like LUN failure) so just print out a message.  In
the case where logical volumes are added or deleted we call
rebuild_lun_table to refresh the driver's view of the world.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-07 08:12:38 +02:00
Jens Axboe ac44e5b2ed cciss: fix residual count for block pc requests
We must complete the full request, so store the request count and then set
the ->data_len to the residual count from the hardware.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-07 08:12:38 +02:00
Linus Torvalds ea02259fdf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/linux-hdreg-h-cleanup
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/linux-hdreg-h-cleanup:
  remove <linux/ata.h> include from <linux/hdreg.h>
  include/linux/hdreg.h: remove unused defines
  isd200: use ATA_* defines instead of *_STAT and *_ERR ones
  include/linux/hdreg.h: cover WIN_* and friends with #ifndef/#endif __KERNEL__
  aoe: WIN_* -> ATA_CMD_*
  isd200: WIN_* -> ATA_CMD_*
  include/linux/hdreg.h: cover struct hd_driveid with #ifndef/#endif __KERNEL__
  xsysace: make it 'struct hd_driveid'-free
  ubd_kern: make it 'struct hd_driveid'-free
  isd200: make it 'struct hd_driveid'-free
2009-04-03 09:02:32 -07:00
Pavel Machek 15746fcaa3 nbd: trivial cleanups
Trivial cleanups for nbd: only the return -EIO one really changes code,
and I've verified all the callers (plus 0 == success, 1 == error
convention is really ugly).

Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:02 -07:00
Pavel Machek 1a2ad21128 nbd: add locking to nbd_ioctl
The code was written to rely on big kernel lock to protect it from races.
It mostly works when interface is not abused.

So this uses tx_lock to protect data structures from concurrent use
between ioctl and worker threads.

Next step will be moving from ioctl to unlocked_ioctl.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: add missing return]
Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Paul Clements <paul.clements@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:02 -07:00
Scott James Remnant 83f9ef463b floppy: provide a PNP device table in the module.
The missing device table means that the floppy module is not auto-loaded,
even when the appropriate PNP device (0700) is found.

We don't actually use the table in the module, since the device doesn't
have a struct pnp_driver, but it's sufficient to cause an alias in the
module that udev/modprobe will use.

Signed-off-by: Scott James Remnant <scott@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Philippe De Muyter <phdm@macqel.be>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:04:49 -07:00
Bartlomiej Zolnierkiewicz 4fe6e30645 include/linux/hdreg.h: remove unused defines
* Move HD_IRQ define to drivers/block/hd.c (only user).

* Remove unused *_STAT, *_ERR, HD_*, CD, IO, REL and TAG_MASK defines.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-04-01 21:42:25 +02:00
Bartlomiej Zolnierkiewicz 04b3ab52a0 aoe: WIN_* -> ATA_CMD_*
* Use ATA_CMD_* defines instead of WIN_* ones.

* Include <linux/ata.h> directly instead of through <linux/hdreg.h>.

Cc: Ed L. Cashin <ecashin@coraid.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-04-01 21:42:24 +02:00
Bartlomiej Zolnierkiewicz 4aaf2fec71 xsysace: make it 'struct hd_driveid'-free
* Change cf_id field in struct ace_device from 'struct hd_driveid *id'
  to 'u16 *id' and update driver accordingly.

* Include <linux/ata.h> directly instead of through <linux/hdreg.h>.

While at it:

* Use ata_id_u32() macro.

There should be no functional changes caused by this patch.

Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-04-01 21:42:22 +02:00
J. R. Okajima 53d6660836 loop: add ioctl to resize a loop device
Add the ability to 'resize' the loop device on the fly.

One practical application is a loop file with XFS filesystem, already
mounted: You can easily enlarge the file (append some bytes) and then call
ioctl(fd, LOOP_SET_CAPACITY, new); The loop driver will learn about the
new size and you can use xfs_growfs later on, which will allow you to use
full capacity of the loop file without the need to unmount.

Test app:

#define _GNU_SOURCE	/* must precede every #include to take effect */
#include <linux/fs.h>
#include <linux/loop.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <getopt.h>

char *me;

void usage(FILE *f)
{
	fprintf(f, "%s [options] loop_dev [backend_file]\n"
		"-s, --set new_size_in_bytes\n"
		"\twhen backend_file is given, "
		"it will be expanded too while keeping the original contents\n",
		me);
}

struct option opts[] = {
	{
		.name		= "set",
		.has_arg	= 1,
		.flag		= NULL,
		.val		= 's'
	},
	{
		.name		= "help",
		.has_arg	= 0,
		.flag		= NULL,
		.val		= 'h'
	},
	{ 0 }	/* getopt_long() requires an all-zero terminating entry */
};

void err_size(char *name, __u64 old)
{
	fprintf(stderr, "size must be larger than current %s (%llu)\n",
		name, old);
}

int main(int argc, char *argv[])
{
	int fd, err, c, i, bfd;
	ssize_t ssz;
	size_t sz;
	__u64 old, new, append;
	char a[BUFSIZ] = { 0 };	/* zero-filled padding appended to the backend file */
	struct stat st;
	FILE *out;
	char *backend, *dev;

	err = EINVAL;
	out = stderr;
	me = argv[0];
	new = 0;
	while ((c = getopt_long(argc, argv, "s:h", opts, &i)) != -1) {
		switch (c) {
		case 's':
			errno = 0;
			new = strtoull(optarg, NULL, 0);
			if (errno) {
				err = errno;
				perror(argv[i]);
				goto out;
			}
			break;

		case 'h':
			err = 0;
			out = stdout;
			goto err;

		default:
			perror(argv[i]);
			goto err;
		}
	}

	if (optind < argc)
		dev = argv[optind++];
	else
		goto err;

	fd = open(dev, O_RDONLY);
	if (fd < 0) {
		err = errno;
		perror(dev);
		goto out;
	}

	err = ioctl(fd, BLKGETSIZE64, &old);
	if (err) {
		err = errno;
		perror("ioctl BLKGETSIZE64");
		goto out;
	}

	if (!new) {
		printf("%llu\n", old);
		goto out;
	}

	if (new < old) {
		err = EINVAL;
		err_size(dev, old);
		goto out;
	}

	if (optind < argc) {
		backend = argv[optind++];
		bfd = open(backend, O_WRONLY|O_APPEND);
		if (bfd < 0) {
			err = errno;
			perror(backend);
			goto out;
		}
		err = fstat(bfd, &st);
		if (err) {
			err = errno;
			perror(backend);
			goto out;
		}
		if (new < st.st_size) {
			err = EINVAL;
			err_size(backend, st.st_size);
			goto out;
		}
		append = new - st.st_size;
		sz = sizeof(a);
		while (append > 0) {
			if (append < sz)
				sz = append;
			ssz = write(bfd, a, sz);
			if (ssz != sz) {
				err = errno;
				perror(backend);
				goto out;
			}
			append -= sz;
		}
		err = fsync(bfd);
		if (err) {
			err = errno;
			perror(backend);
			goto out;
		}
	}

	err = ioctl(fd, LOOP_SET_CAPACITY, new);
	if (err) {
		err = errno;
		perror("ioctl LOOP_SET_CAPACITY");
	}
	goto out;

 err:
	usage(out);
 out:
	return err;
}

Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: Tomas Matejicek <tomas@slax.org>
Cc: <util-linux-ng@vger.kernel.org>
Cc: Karel Zak <kzak@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <linux-api@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:17 -07:00
Alexey Dobriyan 99b7623380 proc 2/2: remove struct proc_dir_entry::owner
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning anything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as an easily manipulated field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:44 +04:00
Linus Torvalds 4496d937a5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: irq_node.handler() should return irqreturn_t
  m68k: section mismatch fixes: Atari SCSI
  m68k: section mismatch fixes: DMAsound for Atari
  MAINTAINERS: Replace dead link to m68k CVS repository by link to new git repository
  m68k: mac - Add SWIM floppy support
  m68k: mac - Add a new entry in mac_model to identify the floppy controller type.
  m68k: Add install target
2009-03-26 16:15:31 -07:00
Linus Torvalds 86d9c07017 Merge branch 'for-2.6.30' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.30' of git://git.kernel.dk/linux-2.6-block:
  Get rid of pdflush_operation() in emergency sync and remount
  btrfs: get rid of current_is_pdflush() in btrfs_btree_balance_dirty
  Move the default_backing_dev_info out of readahead.c and into backing-dev.c
  block: Repeated lines in switching-sched.txt
  bsg: Remove bogus check against request_queue->max_sectors
  block: WARN in __blk_put_request() for potential bio leak
  loop: fix circular locking in loop_clr_fd()
  loop: support barrier writes
  bsg: add support for tail queuing
  cpqarray: enable bus mastering
  block: genhd.h cleanup patch
  block: add private bio_set for bio integrity allocations
  block: genhd.h comment needs updating
  block: get rid of unused blkdev_free_rq() define
  block: remove various blk_queue_*() setting functions in blk_init_queue_node()
  cciss: add BUILD_BUG_ON() for catching bad CommandList_struct alignment
  block: don't create bio_vec slabs of less than the inline number
  block: cleanup bio_alloc_bioset()
2009-03-26 16:03:04 -07:00
David S. Miller 08abe18af1 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/wimax/i2400m/usb-notif.c
2009-03-26 15:23:24 -07:00
Laurent Vivier 8852ecd974 m68k: mac - Add SWIM floppy support
It allows reading data from a floppy, but not writing to it, and ejecting the
floppy (useful on our Mac without an eject button).

Signed-off-by: Laurent Vivier <Laurent@lvivier.info>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-03-26 21:15:27 +01:00
Linus Torvalds 61a091827e Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (97 commits)
  USB: qcserial: add device id for HP devices
  USB: isp1760: Add a delay before reading the SKIPMAP registers in isp1760-hcd.c
  USB: allow malformed LANGID descriptors
  USB: pxa27x_udc: typo fixes and code cleanups
  USB: gadget: gadget zero uses new suspend/resume hooks
  USB: gadget: composite device-level suspend/resume hooks
  USB: r8a66597-hcd: suspend/resume support
  USB: more u32 conversion after transfer_buffer_length and actual_length
  USB: Fix cp2101 USB serial device driver termios functions for console use
  USB: CP2101 New Device ID
  USB: ipaq: handle 4 endpoint devices
  USB: S3C: Move usb-control.h to platform include
  USB: ohci-hcd: Add ARCH_S3C24XX to the ohci-s3c2410.c glue
  USB: pedantic: spelling correction in comment for ch9.h
  USB: host: fix sparse warning: Using plain integer as NULL pointer
  USB: ohci-s3c2410: fix name of bus clock
  USB: ohci-s3c2410: remove <mach/hardware.h> include
  USB: serial: rename cp2101 driver to cp210x
  USB: CP2101 Reduce Error Logging
  USB: CP2101 Support AN205 baud rates
  ...
2009-03-26 11:17:39 -07:00
Nikanth Karthikesan f028f3b2f9 loop: fix circular locking in loop_clr_fd()
With CONFIG_PROVE_LOCKING enabled

$ losetup /dev/loop0 file
$ losetup -o 32256 /dev/loop1 /dev/loop0

$ losetup -d /dev/loop1
$ losetup -d /dev/loop0

triggers a [ INFO: possible circular locking dependency detected ]

I think this warning is a false positive.

Open/close on a loop device acquires bd_mutex of the device before
acquiring lo_ctl_mutex of the same device. For ioctl(LOOP_CLR_FD) after
acquiring lo_ctl_mutex, fput on the backing_file might acquire the bd_mutex of
a device, if backing file is a device and this is the last reference to the
file being dropped . But it is guaranteed that it is impossible to have a
circular list of backing devices.(say loop2->loop1->loop0->loop2 is not
possible), which guarantees that this can never deadlock.

So this warning should be suppressed. It is very difficult to annotate lockdep
not to warn here in the correct way. A simple way to silence lockdep could be
to mark the lo_ctl_mutex in ioctl to be a sub class, but this might mask some
other real bugs.

@@ -1164,7 +1164,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 	struct loop_device *lo = bdev->bd_disk->private_data;
 	int err;

-	mutex_lock(&lo->lo_ctl_mutex);
+	mutex_lock_nested(&lo->lo_ctl_mutex, 1);
 	switch (cmd) {
 	case LOOP_SET_FD:
 		err = loop_set_fd(lo, mode, bdev, arg);

Or actually marking the bd_mutex after lo_ctl_mutex as a sub class could be
a better solution.

Luckily it is easy to avoid calling fput on backing file with lo_ctl_mutex
held, so no lockdep annotation is required.

If you do not like the special handling of the lo_ctl_mutex just for the
LOOP_CLR_FD ioctl in lo_ioctl(), the mutex handling could be moved inside
each of the individual ioctl handlers and I could send you another patch.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-26 11:01:19 +01:00
Eric Miao 71b3e0c1ad platform: make better use of to_platform_{device,driver}() macros
This helps the code look more consistent and cleaner.

Signed-off-by: Eric Miao <eric.miao@marvell.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:38:24 -07:00
Alan Stern e6e244b6cb usb-storage: prepare for subdriver separation
This patch (as1206) is the first step in converting usb-storage's
subdrivers into separate modules.  It makes the following large-scale
changes:

	Remove a bunch of unnecessary #ifdef's from usb_usual.h.
	Not truly necessary, but it does clean things up.

	Move the USB device-ID table (which is duplicated between
	libusual and usb-storage) into its own source file,
	usual-tables.c, and arrange for this to be linked with
	either libusual or usb-storage according to whether
	USB_LIBUSUAL is configured.

	Add to usual-tables.c a new usb_usual_ignore_device()
	function to detect whether a particular device needs to be
	managed by a subdriver and not by the standard handlers
	in usb-storage.

	Export a whole bunch of functions in usb-storage, renaming
	some of them because their names don't already begin with
	"usb_stor_".  These functions will be needed by the new
	subdriver modules.

	Split usb-storage's probe routine into two functions.
	The subdrivers will call the probe1 routine, then fill in
	their transport and protocol settings, and then call the
	probe2 routine.

	Take the default cases and error checking out of
	get_transport() and get_protocol(), which run during
	probe1, and instead put a check for invalid transport
	or protocol values into the probe2 function.

	Add a new probe routine to be used for standard devices,
	i.e., those that don't need a subdriver.  This new routine
	checks whether the device should be ignored (because it
	should be handled by ub or by a subdriver), and if not,
	calls the probe1 and probe2 functions.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:34 -07:00
Julia Lawall db5e6df172 USB: ub: use USB API functions rather than constants
This set of patches introduces calls to the following set of functions:

usb_endpoint_dir_in(epd)
usb_endpoint_dir_out(epd)
usb_endpoint_is_bulk_in(epd)
usb_endpoint_is_bulk_out(epd)
usb_endpoint_is_int_in(epd)
usb_endpoint_is_int_out(epd)
usb_endpoint_num(epd)
usb_endpoint_type(epd)
usb_endpoint_xfer_bulk(epd)
usb_endpoint_xfer_control(epd)
usb_endpoint_xfer_int(epd)
usb_endpoint_xfer_isoc(epd)

In some cases, introducing one of these functions is not possible, and it
just replaces an explicit integer value by one of the following constants:

USB_ENDPOINT_XFER_BULK
USB_ENDPOINT_XFER_CONTROL
USB_ENDPOINT_XFER_INT
USB_ENDPOINT_XFER_ISOC

An extract of the semantic patch that makes these changes is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r1@ struct usb_endpoint_descriptor *epd; @@

- ((epd->bmAttributes & \(USB_ENDPOINT_XFERTYPE_MASK\|3\)) ==
- \(USB_ENDPOINT_XFER_CONTROL\|0\))
+ usb_endpoint_xfer_control(epd)

@r5@ struct usb_endpoint_descriptor *epd; @@

- ((epd->bEndpointAddress & \(USB_ENDPOINT_DIR_MASK\|0x80\)) ==
-  \(USB_DIR_IN\|0x80\))
+ usb_endpoint_dir_in(epd)

@inc@
@@

#include <linux/usb.h>

@depends on !inc && (r1||r5)@
@@

+ #include <linux/usb.h>
  #include <linux/usb/...>
// </smpl>
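
For readers unfamiliar with the helpers, the two rules in the extract boil down
to the mask-and-compare tests sketched below. This is a standalone
approximation: struct epd_sketch and the sketch_* functions are invented for
the example, and the constant values are taken from the literals that appear
in the semantic patch itself (3, 0 and 0x80).

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the integer literals used in the semantic patch. */
#define USB_ENDPOINT_XFERTYPE_MASK 0x03
#define USB_ENDPOINT_XFER_CONTROL  0
#define USB_ENDPOINT_DIR_MASK      0x80
#define USB_DIR_IN                 0x80

/* Cut-down stand-in for struct usb_endpoint_descriptor. */
struct epd_sketch {
	uint8_t bEndpointAddress;
	uint8_t bmAttributes;
};

/* Open-coded test replaced by rule r1. */
static int sketch_endpoint_xfer_control(const struct epd_sketch *epd)
{
	return (epd->bmAttributes & USB_ENDPOINT_XFERTYPE_MASK) ==
	       USB_ENDPOINT_XFER_CONTROL;
}

/* Open-coded test replaced by rule r5. */
static int sketch_endpoint_dir_in(const struct epd_sketch *epd)
{
	return (epd->bEndpointAddress & USB_ENDPOINT_DIR_MASK) == USB_DIR_IN;
}
```

Calling a named helper instead of repeating the mask and comparison keeps the
intent visible at each call site.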

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-03-24 16:20:27 -07:00
Nikanth Karthikesan 68db1961bb loop: support barrier writes
Honour barrier requests in the loop back block device driver.
In case of barrier bios, flush the backing file once before processing the
barrier and once after to guarantee ordering. In case of filesystems that
do not support fsync, barrier bios will fail with -EOPNOTSUPP.
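
The flush, write, flush ordering can be sketched in userspace with fsync() and
pwrite(). This is only an analogy to what the driver does on the backing file:
barrier_write_sketch() and flush_sketch() are hypothetical names, and mapping
EINVAL from fsync() to -EOPNOTSUPP is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Flush the file; report "not supported" when the descriptor cannot be
 * synced at all (fsync() fails with EINVAL), -EIO for other failures. */
static int flush_sketch(int fd)
{
	if (fsync(fd) == 0)
		return 0;
	return (errno == EINVAL) ? -EOPNOTSUPP : -EIO;
}

/* Barrier-style write: everything issued before is flushed first, then
 * the barrier data is written, then it is flushed in turn. */
static int barrier_write_sketch(int fd, const void *buf, size_t len, off_t pos)
{
	ssize_t n;
	int ret;

	ret = flush_sketch(fd);		/* flush before the barrier */
	if (ret)
		return ret;

	n = pwrite(fd, buf, len, pos);
	if (n < 0 || (size_t)n != len)
		return -EIO;

	return flush_sketch(fd);	/* flush after the barrier */
}
```

Failing early on the first flush means no barrier data is written to a backing
file that cannot honour the ordering guarantee.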

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-24 12:35:18 +01:00
Dave Jones 0061d38642 cpqarray: enable bus mastering
We've been carrying this patch for the last 3 years in Fedora,
long past time we got it upstream...

Call pci_set_master to enable bus-mastering if the BIOS hasn't
done it already.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-24 12:35:17 +01:00
Jens Axboe 10cbda97e7 cciss: add BUILD_BUG_ON() for catching bad CommandList_struct alignment
The hardware requires 64-bit alignment of commands, so add a build bug
check for that. The recent commit 8a3173de4a
didn't change the size of the command, but other additions/changes may and
thus break badly at runtime.
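
A build-time size check of this kind can be sketched with the classic
negative-array-size trick. The macro name BUILD_BUG_ON_SKETCH, the structure
layout and the helper command_size_checked() are all made up for illustration;
the real CommandList_struct is different.

```c
#include <assert.h>
#include <stdint.h>

/* Compilation fails when cond is true, because the array size becomes -1. */
#define BUILD_BUG_ON_SKETCH(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#define COMMAND_ALIGN 8	/* 64 bits, expressed in bytes */

/* Hypothetical command layout, not the real CommandList_struct. */
struct command_sketch {
	uint32_t header[4];
	uint8_t  cdb[16];
	uint64_t bus_addr;
};

/* Returns the command size; refuses to compile if a later change makes
 * the size stop being a multiple of COMMAND_ALIGN. */
static unsigned long command_size_checked(void)
{
	BUILD_BUG_ON_SKETCH(sizeof(struct command_sketch) % COMMAND_ALIGN);
	return sizeof(struct command_sketch);
}
```

The check costs nothing at runtime, since the whole expression is evaluated by
the compiler.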

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-24 12:35:16 +01:00
Benjamin Herrenschmidt c71327ad9f Merge commit 'gcl/merge' into merge
2009-03-18 13:16:30 +11:00
Grant Likely bfbd442f69 Fix Xilinx SystemACE driver to handle empty CF slot
The SystemACE driver does not handle an empty CF slot gracefully. An
empty CF slot ends up hanging the system. This patch adds a check for
the CF state and stops trying to process requests if the slot is empty.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-14 21:06:52 +01:00
Geert Uytterhoeven f507cd2203 ps3/block: Replace mtd/ps3vram by block/ps3vram
Convert the PS3 Video RAM Storage Driver from an MTD driver to a plain block
device driver.

The ps3vram driver exposes unused video RAM on the PS3 as a block device
suitable for storage or swap.  Fast data transfer is achieved using a local
cache in system RAM and DMA transfers via the GPU.

The new driver is ca. 50% faster for reading, and ca. 10% for writing.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-13 16:07:19 +11:00
Stephen Hemminger 7546dd97d2 net: convert usage of packet_type to read_mostly
The packet_type structures used by protocols can be placed in the __read_mostly
section for better locality. Eliminate any unnecessary initializations to NULL.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-10 05:22:43 -07:00
Linus Torvalds df0b4a5080 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
  p54: fix race condition in memory management
  cfg80211: test before subtraction on unsigned
  iwlwifi: fix error flow in iwl*_pci_probe
  rt2x00 : more devices to rt73usb.c
  rt2x00 : more devices to rt2500usb.c
  bonding: Fix device passed into ->ndo_neigh_setup().
  vlan: Fix vlan-in-vlan crashes.
  net: Fix missing dev->neigh_setup in register_netdevice().
  tmspci: fix request_irq race
  pkt_sched: act_police: Fix a rate estimator test.
  tg3: Fix 5906 link problems
  SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails
  IPv6: add "disable" module parameter support to ipv6.ko
  sungem: another error printed one too early
  aoe: error printed 1 too early
  net pcmcia: worklimit reaches -1
  net: more timeouts that reach -1
  net: fix tokenring license
  dm9601: new vendor/product IDs
  netlink: invert error code in netlink_set_err()
  ...
2009-03-09 09:15:40 -07:00
Roel Kluin a3941ec101 loop: don't increment p->offset with (size_t) -EINVAL
Upon a 'transfer error block', size is set to -EINVAL, but this becomes positive
since size is unsigned, so p->offset still gets incremented.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-05 12:04:57 +01:00
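The signedness problem described in the commit above can be reproduced in a few lines of userspace C. This is a sketch of the bug pattern only, not the actual loop.c patch; struct demo_pos and both functions are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-request state; only offset matters here. */
struct demo_pos {
    size_t offset;
};

/* Buggy shape: the error code is stored in an unsigned variable, so the
 * "size > 0" check passes and offset is advanced even on failure. */
static void demo_advance_buggy(struct demo_pos *p, int transfer_failed, size_t len)
{
    size_t size = len;

    if (transfer_failed)
        size = -EINVAL;      /* wraps to a huge positive value */
    if (size > 0)
        p->offset += size;   /* still executed on error */
}

/* Fixed shape: report the error through a signed return value and only
 * advance the offset on success. */
static int demo_advance_fixed(struct demo_pos *p, int transfer_failed, size_t len)
{
    if (transfer_failed)
        return -EINVAL;
    p->offset += len;
    return 0;
}
```

Keeping error codes in a signed type, separate from byte counts, makes the failure path impossible to confuse with a large transfer.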
Jens Axboe 5e18cfd04f cciss: remove 30 second initial timeout on controller reset
Commit 5e4c91c84b forgot to remove the
initial sleep, get rid of it.

Thanks to Randy Dunlap <randy.dunlap@oracle.com> for spotting this error.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-05 12:04:57 +01:00
Kris Shannon a1a15ac5f9 Fix kernel NULL pointer dereference in xen-blkfront
When booting Xen Dom0 on a pre-release 3.2.1 hypervisor the system Oopses on a
"Unable to handle kernel NULL pointer dereference" in xenwatch.

From the backtrace it looks like backend_changed is calling bdget_disk
with a NULL pointer.  Checking for NULL and returning ENODEV instead
allows the kernel to boot.
2009-03-05 12:04:57 +01:00
Roel Kluin 9487311157 aoe: error printed 1 too early
With while (i-- > 0), i reaches -1 after the loop, so the error below is printed
one too early: 0 still means success.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04 00:11:52 -08:00
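The off-by-one described in the commit above comes from the post-decrement in the loop condition. A small sketch, with hypothetical function names, shows the counter value in the timeout case and in the "succeeded on the last try" case:

```c
/* Returns the counter value after a polling loop that never succeeds. */
static int demo_counter_after_timeout(int tries)
{
    int i = tries;

    while (i-- > 0)
        ;               /* pretend the awaited condition never becomes true */
    return i;           /* the final failed test still decrements i */
}

/* Returns the counter value when the condition is met on the last try. */
static int demo_counter_after_last_try(int tries)
{
    int i = tries;

    while (i-- > 0) {
        if (i == 0)     /* condition satisfied on the final iteration */
            break;
    }
    return i;
}

/* Correct timeout test for this loop shape: only a negative counter
 * means that every try was used up without success. */
static int demo_timed_out(int i)
{
    return i < 0;
}
```

Testing for i == 0 after such a loop would misreport a success on the last try as a timeout, which is the error the commit removes.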
Jens Axboe 9e973e64ac xen/blkfront: use blk_rq_map_sg to generate ring entries
On occasion, the request will apparently have more segments than we
fit into the ring. Jens says:

> The second problem is that the block layer then appears to create one
> too many segments, but from the dump it has rq->nr_phys_segments ==
> BLKIF_MAX_SEGMENTS_PER_REQUEST. I suspect the latter is due to
> xen-blkfront not handling the merging on its own. It should check that
> the new page doesn't form part of the previous page. The
> rq_for_each_segment() iterates all single bits in the request, not dma
> segments. The "easiest" way to do this is to call blk_rq_map_sg() and
> then iterate the mapped sg list. That will give you what you are
> looking for.

> Here's a test patch, compiles but otherwise untested. I spent more
> time figuring out how to enable XEN than to code it up, so YMMV!
> Probably the sg list wants to be put inside the ring and only
> initialized on allocation, then you can get rid of the sg on stack and
> sg_init_table() loop call in the function. I'll leave that, and the
> testing, to you.

[Moved sg array into info structure, and initialize once. -J]

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-26 10:45:48 +01:00
Jens Axboe 5e4c91c84b cciss: shorten 30s timeout on controller reset
If reset_devices is set for kexec, then cciss will delay 30 seconds
since the old 5i controller _may_ need that long to recover. Replace
the long sleep with incremental sleeps and tests, so that 30 seconds is
only the worst case for the 5i and other controllers proceed quickly.

Reviewed-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-26 10:45:48 +01:00
Geert Uytterhoeven 3d92e8f3ae m68k: atari - Rename "mfp" to "st_mfp"
http://kisskb.ellerman.id.au/kisskb/buildresult/72115/:
| net/mac80211/ieee80211_i.h:327: error: syntax error before 'volatile'
| net/mac80211/ieee80211_i.h:350: error: syntax error before '}' token
| net/mac80211/ieee80211_i.h:455: error: field 'sta' has incomplete type
| distcc[19430] ERROR: compile net/mac80211/main.c on sprygo/32 failed

This is caused by

| # define mfp ((*(volatile struct MFP*)MFP_BAS))

in arch/m68k/include/asm/atarihw.h, which conflicts with the new "mfp" enum in
net/mac80211/ieee80211_i.h.

Rename "mfp" to "st_mfp", as it's a way too generic name for a global #define.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-22 09:23:02 -08:00
Linus Torvalds ba95fd47d1 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list
  block: fix booting from partitioned md array
  block: revert part of 18ce3751cc
  cciss: PCI power management reset for kexec
  paride/pg.c: xs(): &&/|| confusion
  fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free
  block: fix bad definition of BIO_RW_SYNC
  bsg: Fix sense buffer bug in SG_IO
2009-02-18 18:33:04 -08:00
Philippe De Muyter 5a74db06cc floppy: request and release only the ports we actually use
The floppy driver requests an I/O port it doesn't need, and sometimes this
causes a conflict with a motherboard device reported by PNPBIOS.

This patch makes the floppy driver request and release only the ports it
actually uses.  It also factors out the request/release stuff and the
io-ports list so they're all in one place now.

The current floppy driver uses only these ports:

    0x3f2 (FD_DOR)
    0x3f4 (FD_STATUS)
    0x3f5 (FD_DATA)
    0x3f7 (FD_DCR/FD_DIR)

but it requests 0x3f2-0x3f5 and 0x3f7, which includes the unused port
0x3f3.

Some BIOSes report 0x3f3 as a motherboard resource.  The PNP system driver
reserves that, which causes a conflict when the floppy driver requests
0x3f2-0x3f5 later.

Philippe reported that this conflict broke the floppy driver between
2.6.11 and 2.6.22.  His PNPBIOS reports these devices:

    $ cat 00:07/id 00:07/resources	# motherboard device
    PNP0c02
    state = active
    io 0x80-0x80
    io 0x10-0x1f
    io 0x22-0x3f
    io 0x44-0x5f
    io 0x90-0x9f
    io 0xa2-0xbf
    io 0x3f0-0x3f1
    io 0x3f3-0x3f3

    $ cat 00:03/id 00:03/resources	# floppy device
    PNP0700
    state = active
    io 0x3f4-0x3f5
    io 0x3f2-0x3f2

Reference:
    http://lkml.org/lkml/2009/1/31/162

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Reported-by: Philippe De Muyter <phdm@macqel.be>
Tested-by: Philippe De Muyter <phdm@macqel.be>
Cc: Adam M Belay <abelay@mit.edu>
Cc: Robert Hancock <hancockrwd@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:55 -08:00
Ed Cashin b6d6c51758 aoe: ignore vendor extension AoE responses
The Welland ME-747K-SI AoE target generates unsolicited AoE responses that
are marked as vendor extensions.  Instead of ignoring these packets, the
aoe driver was generating kernel messages for each unrecognized response
received.  This patch corrects the behavior.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Reported-by: <karaluh@karaluh.pl>
Tested-by: <karaluh@karaluh.pl>
Cc: <stable@kernel.org>
Cc: Alex Buell <alex.buell@munted.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-18 15:37:53 -08:00
Chip Coldwell 82eb03cfd8 cciss: PCI power management reset for kexec
The kexec kernel resets the CCISS hardware in three steps:

1. Use PCI power management states to reset the controller in the
   kexec kernel.

2. Clear the MSI/MSI-X bits in PCI configuration space so that MSI
   initialization in the kexec kernel doesn't fail.

3. Use the CCISS "No-op" message to determine when the controller
   firmware has recovered from the PCI PM reset.

[akpm@linux-foundation.org: cleanups]
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-18 10:32:01 +01:00
Roel Kluin c8cbec6bdf paride/pg.c: xs(): &&/|| confusion
&&/|| confusion

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-18 10:32:01 +01:00
Paul Clements 4d48a542b4 nbd: fix I/O hang on disconnected nbds
Fix a problem that causes I/O to a disconnected (or partially initialized)
nbd device to hang indefinitely.  To reproduce:

# ioctl NBD_SET_SIZE_BLOCKS /dev/nbd23 514048
# dd if=/dev/nbd23 of=/dev/null bs=4096 count=1

...hangs...

This can also occur when an nbd device loses its nbd-client/server
connection.  Although we clear the queue of any outstanding I/Os after the
client/server connection fails, any additional I/Os that get queued later
will hang.

This bug may also be the problem reported in this bug report:
http://bugzilla.kernel.org/show_bug.cgi?id=12277

Testing would need to be performed to determine if the two issues are the
same.

This problem was introduced by the new request handling thread code ("NBD:
allow nbd to be used locally", 3/2008), which entered into mainline around
2.6.25.

The fix, which is fairly simple, is to restore the check for lo->sock
being NULL in do_nbd_request.  This causes I/O to an uninitialized nbd to
immediately fail with an I/O error, as it did prior to the introduction of
this bug.

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Reported-by: Jon Nelson <jnelson-kernel-bugzilla@jamponi.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <stable@kernel.org>		[2.6.26.x, 2.6.27.x, 2.6.28.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-11 14:25:37 -08:00
Stephen Rothwell e377c6e24d powerpc/ps3: Printing fixups for l64 to ll64 conversion drivers/block
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-16 16:15:13 +11:00
Pavel Machek c91192d66d nbd: do not allow two clients at the same time
Two nbd-clients at the same time are a bad idea, and cause a WARN_ON from nbd in
2.6.28-rc7 from sysfs_add_one.  This simply prevents that from happening.

To reproduce:

 cat /dev/zero | head -c 10000000 > /tmp/delme.fstest.fs
 nbd-server 9100 -l /anyone.can.connect > /tmp/delme.fstest.fs &
 sleep 1
 nbd-client localhost 9100 /dev/nd0 &
 nbd-client localhost 9100 /dev/nd0 &

Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-15 16:39:38 -08:00
Benjamin Herrenschmidt bd1f7936ab Merge commit 'gcl/gcl-next' into next 2009-01-13 13:59:11 +11:00
Andreas Bombe 6d0be946e1 m68k: amiflop - Get rid of sleep_on calls
Apart from sleep_on() calls that could be easily converted to
wait_event() and completion calls amiflop also used a flag in ms_delay()
and ms_isr() as a custom mutex for ms_delay() without a need for
explicit unlocking.  I converted that to a standard mutex.

The replacement for the unconditional sleep_on() in fd_motor_on() is a
complete_all() together with an INIT_COMPLETION() before the mod_timer()
call.  It appears to me that fd_motor_on() might be called concurrently
and fd_select() does not guarantee mutual exclusivity in the case the
same drive gets selected again.

Signed-off-by: Andreas Bombe <aeb@debian.org>
Acked-by: Jörg Dorchain <joerg@dorchain.net>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2009-01-12 20:56:33 +01:00
Yuri Tikhonov f5020384e4 powerpc/xsysace: add compatible string for non-ipcore instance
Add "xlnx,sysace" compatible string to the of_platform binding
table.  Platforms which have the SysACE chip on board (e.g.
Katmai) instead of via a Xilinx generated IP core will use
this value in their device tree.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-01-09 15:49:06 -07:00
Linus Torvalds 9e42d0cf50 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Work around branch tracer warning.
  sparc64: Fix unsigned long long warnings in drivers.
  sparc64: Use unsigned long long for u64.
  sparc: refactor code in fault_32.c
  sparc64: refactor code in init_64.c
  sparc64: refactor code in viohs.c
  sparc: make proces_ver_nack a bit more readable
2009-01-07 17:23:53 -08:00
Alan Stern 011b15df46 USB: change interface to usb_lock_device_for_reset()
This patch (as1161) changes the interface to
usb_lock_device_for_reset().  The existing interface is apparently not
very clear, judging from the fact that several of its callers don't
use it correctly.  The new interface always returns 0 for success and
it always requires the caller to unlock the device afterward.

The new routine will not return immediately if it is called while the
driver's probe method is running.  Instead it will wait until the
probe is over and the device has been unlocked.  This shouldn't cause
any problems; I don't know of any cases where drivers call
usb_lock_device_for_reset() during probe.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 09:59:52 -08:00
Sam Ravnborg 3f4528d6e9 sparc64: Fix unsigned long long warnings in drivers.
Fix warnings caused by the unsigned long long usage in sparc
specific drivers.

The drivers were considered sparc specific more or less from the
filename alone.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-06 13:20:38 -08:00
Linus Torvalds ab70537c32 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: struct device - replace bus_id with dev_name()
  lguest: move the initial guest page table creation code to the host
  kvm-s390: implement config_changed for virtio on s390
  virtio_console: support console resizing
  virtio: add PCI device release() function
  virtio_blk: fix type warning
  virtio: block: dynamic maximum segments
  virtio: set max_segment_size and max_sectors to infinite.
  virtio: avoid implicit use of Linux page size in balloon interface
  virtio: hand virtio ring alignment as argument to vring_new_virtqueue
  virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize
  virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize
  virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci.
  virtio: rename 'pagesize' arg to vring_init/vring_size
  virtio: Don't use PAGE_SIZE in virtio_pci.c
  virtio: struct device - replace bus_id with dev_name(), dev_set_name()
  virtio-pci queue allocation not page-aligned
2008-12-30 17:37:25 -08:00
Randy Dunlap b194aee956 virtio_blk: fix type warning
Fix parameter type warning:

linux-next-20081126/drivers/block/virtio_blk.c:307: warning: large integer implicitly truncated to unsigned type

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-12-30 09:26:06 +10:30
Rusty Russell 0864b79a15 virtio: block: dynamic maximum segments
Enhance the driver to handle whatever maximum segment number the host
tells us to handle.  To do this, we need to allocate the scatterlist
dynamically.

We set max_phys_segments and max_hw_segments to the same value (1 if
the host doesn't tell us, since that's safest and all known hosts do
tell us).

Note that kmalloc'ing the structure for large sg_elems might be
problematic: the fix for this is sg_table, but that requires more
work.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-12-30 09:26:05 +10:30
Rusty Russell 4b7f7e2049 virtio: set max_segment_size and max_sectors to infinite.
Setting max_segment_size allows more than 64k per sg element, unless
the host specified a limit.  Setting max_sectors indicates that our
max_hw_segments is the only limit.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-12-30 09:26:05 +10:30
Stephen M. Cameron a0ea862291 cciss: simplify parameters to deregister_disk function
Simplify parameters to deregister_disk function.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:52 +01:00
Milan Broz 8ae30b8958 loop: Do not call loop_unplug for not configured loop device.
The loop_unplug() function expects that the mapping is set
and that lo->lo_backing_file is not NULL.

Unfortunately, loop_set_fd() sets the request queue unplug function,
but loop_clr_fd() doesn't clear it.

The loop device allows a non-configured loop device to be opened in some
situations. If unplug is called on the request queue, the loop module oopses
because lo_backing_file is missing.

Simple reproducer:
	losetup /dev/loop0 /xxx
	losetup -d /dev/loop0
	dmsetup create x --table "0 1 linear /dev/loop0 0"

 EIP is at loop_unplug+0x1d/0x3b
 ...
  Call Trace:
   blk_unplug+0x57/0x5e
   dm_table_unplug_all+0x34/0x77 [dm_mod]
   destroy_inode+0x27/0x38
   generic_delete_inode+0xd5/0xd9
   iput+0x4b/0x4e
   dm_resume+0xca/0xfe [dm_mod]
   dev_suspend+0x143/0x165 [dm_mod]
   dm_ctl_ioctl+0x18e/0x1cf [dm_mod]
   dev_suspend+0x0/0x165 [dm_mod]
   dm_ctl_ioctl+0x0/0x1cf [dm_mod]
   vfs_ioctl+0x22/0x69
   do_vfs_ioctl+0x39d/0x3c7
   trace_hardirqs_on+0xb/0xd
   remove_vma+0x50/0x56
   do_munmap+0x21c/0x237
   sys_ioctl+0x2c/0x45
   sysenter_do_call+0x12/0x31

Several reports here
http://www.kerneloops.org/search.php?search=loop_unplug

Fix it by simply clearing the unplug function when the
backing file is removed.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:52 +01:00
Milan Broz 14f2793958 loop: Flush possible running bios when loop device is released.
When there are still queued bios and the reference count
drops to zero, the loop device must flush all queued bios.

Otherwise the caller may close the device while some bios
are still running, and a later endio() call oopses when it
uses an unallocated mempool.

This happens, for example, when running dm-crypt over loop;
here is a typical oops backtrace:

 Oops: 0000 [#1] PREEMPT SMP
 EIP is at mempool_free+0x12/0x6b
...
 crypt_dec_pending+0x50/0x54 [dm_crypt]
 crypt_endio+0x9f/0xa7 [dm_crypt]
 crypt_endio+0x0/0xa7 [dm_crypt]
 bio_endio+0x2b/0x2e
 loop_thread+0x37a/0x3b1
 do_lo_send_aops+0x0/0x165
 autoremove_wake_function+0x0/0x33
 loop_thread+0x0/0x3b1
 kthread+0x3b/0x61
 kthread+0x0/0x61
 kernel_thread_helper+0x7/0x10

(The crash is also reproducible with other dm targets
running over a loop device.)

The patch fixes it by flushing the bios in the release call,
reusing the flush mechanism used for switching the backing store.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:52 +01:00
Jens Axboe 31dcfab0ae nbd: tell the block layer that it is not a rotational device
Then we can get rid of that manual elevator type fiddling.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:50 +01:00
Jens Axboe b374d18a4b block: get rid of elevator_t typedef
Just use struct elevator_queue everywhere instead.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:50 +01:00
Jens Axboe 8a3173de4a cciss: switch to using hlist for command list management
This both cleans up the code and helps detect the spurious case
of an attempt to remove a command from a queue it doesn't belong
to.

Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:43 +01:00
Fernando Luis Vázquez Cao 66d352e1e4 xen-blkfront: set queue paravirt flag
Xen's blkfront sets noop as the default I/O scheduler at initialization
time to avoid elevator overheads such as idling, but with the advent of
basic disk profiling capabilities this is not necessary anymore. We
should just tell the block layer that we are a paravirt front-end driver
and the elevator will automatically make the necessary adjustments.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:41 +01:00
Fernando Luis Vázquez Cao 7d116b626b virtio_blk: set queue paravirt flag
As a paravirt front-end driver, virtio_blk is not a rotational device so
we want to avoid idling in AS/CFQ. Tell the block layer about this.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:41 +01:00
Linus Torvalds 0191b625ca Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
  net: Allow dependancies of FDDI & Tokenring to be modular.
  igb: Fix build warning when DCA is disabled.
  net: Fix warning fallout from recent NAPI interface changes.
  gro: Fix potential use after free
  sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
  sfc: When disabling the NIC, close the device rather than unregistering it
  sfc: SFT9001: Add cable diagnostics
  sfc: Add support for multiple PHY self-tests
  sfc: Merge top-level functions for self-tests
  sfc: Clean up PHY mode management in loopback self-test
  sfc: Fix unreliable link detection in some loopback modes
  sfc: Generate unique names for per-NIC workqueues
  802.3ad: use standard ethhdr instead of ad_header
  802.3ad: generalize out mac address initializer
  802.3ad: initialize ports LACPDU from const initializer
  802.3ad: remove typedef around ad_system
  802.3ad: turn ports is_individual into a bool
  802.3ad: turn ports is_enabled into a bool
  802.3ad: make ntt bool
  ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
  ...

Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
2008-12-28 12:49:40 -08:00
James Morris cbacc2c7f0 Merge branch 'next' into for-linus 2008-12-25 11:40:09 +11:00
Stephen M. Cameron d8a0be6ab7 cciss: fix problem that deleting multiple logical drives could cause a panic
Fix problem that deleting multiple logical drives could cause a panic.

It fixes a panic which can be easily reproduced in the following way: Just
create several "arrays," each with multiple logical drives via hpacucli,
then delete the first array, and it will blow up in deregister_disk(), in
the call to get_host() when it tries to dig the hba pointer out of a NULL
queue pointer.

The problem has been present since my code to make rebuild_lun_table
behave better went in.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-19 08:14:07 +01:00
David S. Miller eb14f01959 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/e1000e/ich8lan.c
2008-12-15 20:03:50 -08:00
Kay Sievers cba767175b pktcdvd: remove broken dev_t export of class devices
The class devices created by pktcdvd only export some sysfs files,
but have no char dev_t registered in the driver.

At class device creation time they copy the dev_t value of the
block device to the char device, which will register a new char
device in the driver core and userspace, with a conflicting dev_t
value.

In many cases the class device's dev_t just points to a random
USB device. This fixes the sysfs "duplicate entry" errors.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Peter Osterlund <petero2@telia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-10 10:03:32 -08:00
Al Viro 2cbed8906f [PATCH] fix bogus argument of blkdev_put() in pktcdvd
final close of ->bdev should match the initial open, i.e.
get FMODE_READ | FMODE_NDELAY; FMODE_READ|FMODE_WRITE has
been a braino.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-04 04:22:59 -05:00
James Morris ec98ce480a Merge branch 'master' into next
Conflicts:
	fs/nfsd/nfs4recover.c

Manually fixed above to use new creds API functions, e.g.
nfs4_save_creds().

Signed-off-by: James Morris <jmorris@namei.org>
2008-12-04 17:16:36 +11:00
David S. Miller aa2ba5f108 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ixgbe/ixgbe_main.c
	drivers/net/smc91x.c
2008-12-02 19:50:27 -08:00
Linus Torvalds 03cfdb86ac Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: Fix system calls on Cell entered with XER.SO=1
  powerpc/cell: Fix GDB watchpoints, again
  powerpc/mpic: Don't reset affinity for secondary MPIC on boot
  powerpc/cell/axon-msi: Retry on missing interrupt
  powerpc: Fix boot freeze on machine with empty memory node
  powerpc: Fix IRQ assignment for some PCIe devices
  powerpc/spufs: Fix spinning in spufs_ps_fault on signal
  powerpc/mpc832x_rdb: fix swapped ethernet ids
  powerpc: Use generic PHY driver for Marvell 88E1111 PHY on GE Fanuc SBC610
  powerpc/85xx: L2 cache size wrong in 8572DS dts
  powerpc/virtex: Update defconfigs
  powerpc/52xx: update defconfigs
  xsysace: Fix driver to use resource_size_t instead of unsigned long
  powerpc/virtex: fix various format/casting printk mismatches
  powerpc/mpc5200: fix bestcomm Kconfig dependencies
  powerpc/44x: Fix 460EX/460GT machine check handling
  powerpc/40x: Limit allocable DRAM during early mapping
2008-11-30 16:44:18 -08:00
Harvey Harrison 411c41eea5 aoe: remove private mac address format function
Add %pm to omit the colons when printing a mac address.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:40:37 -08:00
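The colon-free layout that the commit above introduces with %pm can be illustrated in userspace with snprintf(). This sketch shows the output shape only; it is not the kernel's vsprintf code, the function name is hypothetical, and lowercase hex digits are an assumption of the sketch.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the 6-byte MAC address as 12 hex digits with no separators.
 * buf must hold at least 13 bytes (12 digits plus the terminating NUL).
 * Returns the number of characters snprintf() would have written. */
static int demo_format_mac_plain(char *buf, size_t len,
                                 const unsigned char mac[6])
{
    return snprintf(buf, len, "%02x%02x%02x%02x%02x%02x",
                    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

Having one shared format specifier for this is what allows the driver to drop its private formatting function.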
Paul Mackerras 11bac8a026 Merge branch 'merge' of git://git.secretlab.ca/git/linux-2.6-mpc52xx into merge 2008-11-24 11:53:44 +11:00
Randy Dunlap 9f92f47197 cciss: fix DEBUG printk formats
Fix printk format warnings when CCISS_DEBUG is defined.

drivers/block/cciss.c:2856: warning: format '%d' expects type 'int', but argument 2 has type 'long unsigned int'
drivers/block/cciss.c:3205: warning: format '%x' expects type 'unsigned int', but argument 2 has type 'long unsigned int'
drivers/block/cciss.c:3236: warning: format '%x' expects type 'unsigned int', but argument 2 has type '__u64'
drivers/block/cciss.c:3246: warning: format '%x' expects type 'unsigned int', but argument 2 has type '__u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-19 18:50:00 -08:00
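The warnings quoted in the commit above arise when a format specifier does not match the argument type. A userspace sketch of type-correct formatting follows, assuming the usual remedies (%lu for unsigned long, and a cast to unsigned long long with %llx for 64-bit values); the actual changes in cciss.c may differ, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats an unsigned long and a 64-bit value with matching specifiers.
 * The explicit cast keeps the call correct whether uint64_t is defined
 * as unsigned long or unsigned long long on the target. */
static int demo_format_values(char *buf, size_t len,
                              unsigned long count, uint64_t addr)
{
    return snprintf(buf, len, "count=%lu addr=%llx",
                    count, (unsigned long long)addr);
}
```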
Zhaolei 68aee07f9b Release old elevator on change elevator
We should release the old elevator when changing to a new one.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-18 15:08:56 +01:00
James Morris f3a5c54701 Merge branch 'master' into next
Conflicts:
	fs/cifs/misc.c

Merge to resolve above, per the patch below.

Signed-off-by: James Morris <jmorris@namei.org>

diff --cc fs/cifs/misc.c
index ec36410,addd1dc..0000000
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@@ -347,13 -338,13 +338,13 @@@ header_assemble(struct smb_hdr *buffer
  		/*  BB Add support for establishing new tCon and SMB Session  */
  		/*      with userid/password pairs found on the smb session   */
  		/*	for other target tcp/ip addresses 		BB    */
 -				if (current->fsuid != treeCon->ses->linux_uid) {
 +				if (current_fsuid() != treeCon->ses->linux_uid) {
  					cFYI(1, ("Multiuser mode and UID "
  						 "did not match tcon uid"));
- 					read_lock(&GlobalSMBSeslock);
- 					list_for_each(temp_item, &GlobalSMBSessionList) {
- 						ses = list_entry(temp_item, struct cifsSesInfo, cifsSessionList);
+ 					read_lock(&cifs_tcp_ses_lock);
+ 					list_for_each(temp_item, &treeCon->ses->server->smb_ses_list) {
+ 						ses = list_entry(temp_item, struct cifsSesInfo, smb_ses_list);
 -						if (ses->linux_uid == current->fsuid) {
 +						if (ses->linux_uid == current_fsuid()) {
  							if (ses->server == treeCon->ses->server) {
  								cFYI(1, ("found matching uid substitute right smb_uid"));
  								buffer->Uid = ses->Suid;
2008-11-18 18:52:37 +11:00
Linus Torvalds fab349cceb Merge branch 'doc-subdirs' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs
* 'doc-subdirs' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs:
  Create/use more directory structure in the Documentation/ tree.
2008-11-15 11:51:03 -08:00
Randy Dunlap 31c00fc15e Create/use more directory structure in the Documentation/ tree.
Create Documentation/blockdev/ sub-directory and populate it.
Populate the Documentation/serial/ sub-directory.
Move MSI-HOWTO.txt to Documentation/PCI/.
Move ioctl-number.txt to Documentation/ioctl/.
Update all relevant 00-INDEX files.
Update all relevant Kconfig files and source files.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
2008-11-14 17:28:53 +00:00
Yuri Tikhonov c14464bf79 xsysace: Fix driver to use resource_size_t instead of unsigned long
This patch is a bug fix to the SystemACE driver to use resource_size_t
for physical address instead of unsigned long.  This makes the driver
work correctly on 32 bit systems with 64-bit resources (e.g. PowerPC 440).

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-11-14 10:21:57 -07:00
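The truncation that the commit above fixes can be shown with fixed-width types. In this sketch uint32_t stands in for unsigned long on a 32-bit system and uint64_t for a 64-bit resource_size_t; the type names, function names, and the example address are all hypothetical.

```c
#include <stdint.h>

/* Stand-ins: 32-bit unsigned long versus 64-bit resource_size_t. */
typedef uint32_t demo_ulong32;
typedef uint64_t demo_resource_size;

/* Storing a physical address in the narrower type drops the upper bits. */
static demo_ulong32 demo_store_as_ulong(demo_resource_size phys)
{
    return (demo_ulong32)phys;
}

/* Storing it in the resource-sized type preserves the full address. */
static demo_resource_size demo_store_as_resource(demo_resource_size phys)
{
    return phys;
}
```

For an address above 4 GiB such as 0x4E0000000, the narrow variant keeps only 0xE0000000, so the driver would map the wrong location.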
Grant Likely a108096878 powerpc/virtex: fix various format/casting printk mismatches
Various printk format strings in code used by the Xilinx Virtex platform
are not 32-bit/64-bit safe.  Add correct casting to fix the bugs.

Reported-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2008-11-14 09:59:48 -07:00
James Morris 2b82892565 Merge branch 'master' into next
Conflicts:
	security/keys/internal.h
	security/keys/process_keys.c
	security/keys/request_key.c

Fixed conflicts above by using the non 'tsk' versions.

Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 11:29:12 +11:00
David Howells b0fafa816e CRED: Wrap task credential accesses in the block loopback driver
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:41 +11:00
Pete Zaitcev d73b7aff28 ub: stub pre_reset and post_reset to fix oops
Due to recent changes to usb_reset_device, the following hang occurs:

events/0      D 0000000000000000     0     6      2
 ffff880037477cc0 0000000000000046 ffff880037477c50 ffffffff80237434
 ffffffff80574c80 00000001000a015c 0000000000000286 ffff8800374757d0
 ffff88002a31c860 ffff880037475a00 0000000036779140 ffff880037475a00
Call Trace:
 [<ffffffff80237434>] try_to_del_timer_sync+0x52/0x5b
 [<ffffffff8026f86c>] dma_pool_free+0x1a7/0x1ec
 [<ffffffffa02a928a>] ub_disconnect+0x8e/0x1ad [ub]
 [<ffffffff802407c9>] autoremove_wake_function+0x0/0x2e
 [<ffffffff80378959>] usb_unbind_interface+0x5c/0xb7
 [<ffffffff8036ab70>] __device_release_driver+0x95/0xbd
 [<ffffffff8036ac70>] device_release_driver+0x21/0x2d
 [<ffffffff803789f8>] usb_driver_release_interface+0x44/0x83
 [<ffffffff80378ab9>] usb_forced_unbind_intf+0x17/0x1d
 [<ffffffff80371ba4>] usb_reset_device+0x7d/0x114
 [<ffffffffa02aaffd>] ub_reset_task+0x0/0x293 [ub]
 [<ffffffffa02ab1c1>] ub_reset_task+0x1c4/0x293 [ub]
 [<ffffffff8033dd1e>] flush_to_ldisc+0x0/0x1cd
 [<ffffffffa02aaffd>] ub_reset_task+0x0/0x293 [ub]
 [<ffffffff8023d302>] run_workqueue+0x87/0x114
 [<ffffffff8023d467>] worker_thread+0xd8/0xe7
 [<ffffffff802407c9>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8023d38f>] worker_thread+0x0/0xe7
 [<ffffffff802404c1>] kthread+0x47/0x73
 [<ffffffff8022c8dd>] schedule_tail+0x27/0x60
 [<ffffffff8020c249>] child_rip+0xa/0x11
 [<ffffffff8024047a>] kthread+0x0/0x73
 [<ffffffff8020c23f>] child_rip+0x0/0x11

This is because usb_reset_device now unbinds, and that calls disconnect,
which in case of ub waits until the reset completes... which deadlocks.
Worse, this deadlocks keventd and this takes whole box down.

I'm going to fix this properly later, but let's unbreak the driver
quickly for non-composite devices at least.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-11-13 14:45:04 -08:00
Mike Miller 22bece00dc cciss: fix regression firmware not displayed in procfs
This regression was introduced by commit
6ae5ce8e8d ("cciss: remove redundant code").

This patch fixes a regression where the controller firmware version is not
displayed in procfs.  The previous patch would be called anytime something
changed.  This will get called only once for each controller.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>		[2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:18 -08:00
Mike Miller 404443081c cciss: fix sysfs broken symlink regression
Regression introduced by commit 6ae5ce8e8d
("cciss: remove redundant code").

This patch fixes a broken symlink in sysfs that was introduced by the
above commit.  We broke it in 2.6.27-rc on or about 20080804.  Some
installers are broken if this symlink does not exist and they may not
detect the logical drives configured on the controller.  It does not
require being backported into 2.6.26.x or earlier kernels.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>		[2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:17 -08:00
Andrey Borzenkov 2197d18ded cpqarray: fix return value of cpqarray_init()
As reported by Dick Gevers on Compaq ProLiant:

Oct 13 18:06:51 dvgcpl kernel: Compaq SMART2 Driver (v 2.6.0)
Oct 13 18:06:51 dvgcpl kernel: sys_init_module: 'cpqarray'->init
suspiciously returned 1, it should follow 0/-E convention
Oct 13 18:06:51 dvgcpl kernel: sys_init_module: loading module anyway...
Oct 13 18:06:51 dvgcpl kernel: Pid: 315, comm: modprobe Not tainted
2.6.27-desktop-0.rc8.2mnb #1
Oct 13 18:06:51 dvgcpl kernel:  [<c0380612>] ? printk+0x18/0x1e
Oct 13 18:06:51 dvgcpl kernel:  [<c0158f85>] sys_init_module+0x155/0x1c0
Oct 13 18:06:51 dvgcpl kernel:  [<c0103f06>] syscall_call+0x7/0xb
Oct 13 18:06:51 dvgcpl kernel:  =======================

Make it return 0 on success and -ENODEV if no array was found.
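
A minimal sketch of the 0/-E convention the warning refers to.  This is not
the driver's actual code; the function name and its parameter are
hypothetical and only show the shape of the return value.

```c
#include <errno.h>

/* Hypothetical helper: maps "number of arrays detected" to a module init
 * return value.  Success is 0, failure is a negative errno, and a positive
 * count is never returned. */
static int init_status_from_count(int arrays_found)
{
	if (arrays_found > 0)
		return 0;

	return -ENODEV;
}
```

Returning the count itself (for example 1) is what triggered the
"suspiciously returned 1" message quoted above.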

Reported-by: Dick Gevers <dvgevers@xs4all.nl>
Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:17 -08:00
Mike Miller 77ca7286d1 cciss: new hardware support
Add support for 2 new SAS/SATA controllers.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:16 -08:00
Nick Piggin 4e02ed4b4a fs: remove prepare_write/commit_write
Nothing uses prepare_write or commit_write. Remove them from the tree
completely.

[akpm@linux-foundation.org: schedule simple_prepare_write() for unexporting]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
Al Viro 572c489215 [PATCH] sanitize blkdev_get() and friends
* get rid of fake struct file/struct dentry in __blkdev_get()
* merge __blkdev_get() and do_open()
* get rid of flags argument of blkdev_get()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:49:06 -04:00
Al Viro 9a1c354276 [PATCH] pass fmode_t to blkdev_put()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:58 -04:00
Al Viro 511de73ff0 [PATCH] kill the unused bsize on the send side of /dev/loop
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:56 -04:00
Al Viro ab746cb938 [PATCH] switch z2ram
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:17 -04:00
Al Viro f3f68b3673 [PATCH] switch xsysace
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:15 -04:00
Al Viro a63c848b04 [PATCH] switch xen
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:13 -04:00
Al Viro 961846ca5a [PATCH] switch xd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:11 -04:00
Al Viro 4e10985298 [PATCH] switch virtio_blk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:09 -04:00
Al Viro f115a14ae4 [PATCH] switch viodasd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:07 -04:00
Al Viro 4099a96693 [PATCH] switch ub
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:05 -04:00
Al Viro b4d9a4425b [PATCH] switch swim3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:03 -04:00
Al Viro 5e5e007c25 [PATCH] switch pktcdvd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:48:01 -04:00
Al Viro 8cfc7ca40c [PATCH] switch pf
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:59 -04:00
Al Viro b6a895307a [PATCH] switch pd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:57 -04:00
Al Viro c9acf903e0 [PATCH] switch pcd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:55 -04:00
Al Viro a8cdc308c0 [PATCH] switch nbd
NB: nbd_ioctl() appears to be racy; BKL is held, but doesn't really
help, AFAICS.  Left as-is for now, but it'll need fixing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:53 -04:00
Al Viro bb21488482 [PATCH] switch loop
ioctl doesn't need BKL here

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:51 -04:00
Al Viro a4af9b48cb [PATCH] switch floppy
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:49 -04:00
Al Viro 47844fadb5 [PATCH] switch cpqarray
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:47 -04:00
Al Viro ef7822c2fb [PATCH] switch cciss
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:46 -04:00
Al Viro 2b9ecd0333 [PATCH] switch brd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:44 -04:00
Al Viro 60ad234007 [PATCH] switch ataflop
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:42 -04:00
Al Viro 94562c1751 [PATCH] switch aoeblk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:40 -04:00
Al Viro 47225db519 [PATCH] switch amiflop
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:38 -04:00
Al Viro b564f027ad [PATCH] switch DAC960
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:36 -04:00
Al Viro d4430d62fa [PATCH] beginning of methods conversion
To keep the size of changesets sane we split the switch by drivers;
to keep the damn thing bisectable we do the following:
	1) rename the affected methods, add ones with correct
prototypes, make (few) callers handle both.  That's this changeset.
	2) for each driver convert to new methods.  *ALL* drivers
are converted in this series.
	3) kill the old (renamed) methods.

Note that it _is_ a flagday; all in-tree drivers are converted and by the
end of this series no trace of old methods remains.  The only reason why
we do that this way is to keep the damn thing bisectable and allow per-driver
debugging if anything goes wrong.

New methods:
	open(bdev, mode)
	release(disk, mode)
	ioctl(bdev, mode, cmd, arg)		/* Called without BKL */
	compat_ioctl(bdev, mode, cmd, arg)
	locked_ioctl(bdev, mode, cmd, arg)	/* Called with BKL, legacy */

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:32 -04:00
Al Viro 633a08b812 [PATCH] introduce __blkdev_driver_ioctl()
Analog of blkdev_driver_ioctl() with sane arguments.  For
now uses fake struct file, by the end of the series it won't
and blkdev_driver_ioctl() will become a wrapper around it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:26 -04:00
Al Viro a0eb62a0a4 [PATCH] switch pktcdvd to blkdev_driver_ioctl()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:24 -04:00
Al Viro bbc1cc9784 [PATCH] switch cdrom_{open,release,ioctl} to sane APIs
... convert to it in callers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:22 -04:00
Al Viro 74f3c8aff3 [PATCH] switch scsi_cmd_ioctl() to passing fmode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:14 -04:00
Al Viro 86d434dede [PATCH] eliminate use of ->f_flags in block methods
store needed information in f_mode

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:08 -04:00
Al Viro aeb5d72706 [PATCH] introduce fmode_t, do annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:47:06 -04:00
Parag Warudkar 01e8ef11bc x86: sysfs: kill owner field from attribute
Tejun's commit 7b595756ec made sysfs
attribute->owner unnecessary.  But the field was left in the structure to
ease the merge.  It's been over a year since that change and it is now
time to start killing attribute->owner along with its users - one arch at
a time!

This patch is attempt #1 to get rid of attribute->owner only for
CONFIG_X86_64 or CONFIG_X86_32 .  We will deal with other arches later on
as and when possible - avr32 will be the next since that is something I
can test.  Compile (make allyesconfig / make allmodconfig / custom config)
and boot tested.

akpm: the idea is that we put the declaration of attribute.owner inside
`#ifndef CONFIG_X86'.  But that proved to be too ambitious for now because
new usages kept on turning up in subsystem trees.

[akpm: remove the ifdef for now]
Signed-off-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: Greg KH <greg@kroah.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:42 -07:00
Pete Zaitcev 7dbcbe88b1 ub: remove sg_stat
Remove forgotten code related to sg_stat[].

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-17 14:40:52 -07:00
Greg Kroah-Hartman 1ff9f542e5 device create: block: convert device_create_drvdata to device_create
Now that device_create() has been audited, rename things back to the
original call to be sane.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16 09:24:41 -07:00
Benjamin Herrenschmidt 6dc6472581 Merge commit 'origin'
Manual fixup of conflicts on:

	arch/powerpc/include/asm/dcr-regs.h
	drivers/net/ibm_newemac/core.h
2008-10-15 11:31:54 +11:00
Adrian Bunk 29c8a24672 m68k: Remove the broken Hades support
This patch removes the Hades support that was marked as BROKEN 5 years ago.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-14 10:23:27 -07:00
Linus Torvalds 807f4f8cdd Merge branch 'x86-core-v2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
This merges in:

  x86/build, x86/microcode, x86/spinlocks, x86/memory-corruption-check,
  x86/early-printk, x86/xsave, x86/quirks, x86/setup, x86/signal,
  core/signal, x86/urgent, x86/xen

* 'x86-core-v2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (142 commits)
  x86: make processor type select depend on CONFIG_EMBEDDED
  x86: extend processor type select help text
  x86, amd-iommu: propagate PCI device enabling error
  warnings: fix arch/x86/kernel/io_apic_64.c
  warnings: fix arch/x86/kernel/early_printk.c
  x86, fpu: check __clear_user() return value
  x86: memory corruption check - cleanup
  x86: ioperm user_regset
  xen: do not reserve 2 pages of padding between hypervisor and fixmap.
  xen: use spin_lock_nest_lock when pinning a pagetable
  x86: xsave: set FP, SSE bits in the xsave header in the user sigcontext
  x86: xsave: fix error condition in save_i387_xstate()
  x86: SB450: deprioritize DMI quirks
  x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC
  x86: replace a magic number with a named constant in the VESA boot code
  x86 setup: remove IMAGE_OFFSET
  x86 setup: remove DEF_INITSEG and DEF_SETUPSEG
  Revert "x86: fix ghost EDD devices in /sys again"
  x86 setup: fix ghost entries under /sys/firmware/edd take 3
  x86: signal: remove indent in restore_sigcontext()
  ...
2008-10-12 12:05:14 -07:00
Linus Torvalds 0710483959 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (180 commits)
  leo: disable cursor when leaving graphics mode
  cg6: disable cursor when leaving graphics mode
  sparc32: sun4m interrupt mask cleanup
  drivers/rtc/Kconfig: don't build rtc-cmos.o on sparc32
  sparc: arch/sparc/kernel/pmc.c -- extra #include?
  sparc32: Add more extensive documentation of sun4m interrupts.
  sparc32: Kill irq_rcvreg from sun4m_irq.c
  sparc32: Delete master_l10_limit.
  sparc32: Use PROM device probing for sun4c timers.
  sparc32: Use PROM device probing for sun4c interrupt register.
  sparc32: Delete claim_ticker14().
  sparc32: Stop calling claim_ticker14() from sun4c_irq.c
  sparc32: Kill clear_profile_irq btfixup entry.
  sparc32: Call sun4m_clear_profile_irq() directly from sun4m_smp.c
  sparc32: Remove #if 0'd code from sun4c_irq.c
  sparc32: Remove some SMP ifdefs in sun4d_irq.c
  sparc32: Use PROM infrastructure for probing and mapping sun4d timers.
  sparc32: Use PROM device probing for sun4m irq registers.
  sparc32: Use PROM device probing for sun4m timer registers.
  sparc: Fix user_regset 'n' field values.
  ...
2008-10-12 11:40:55 -07:00
Ingo Molnar 365d46dc9b Merge branch 'linus' into x86/xen
Conflicts:
	arch/x86/kernel/cpu/common.c
	arch/x86/kernel/process_64.c
	arch/x86/xen/enlighten.c
2008-10-12 12:37:32 +02:00
Linus Torvalds 5c3c4d9b58 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (71 commits)
  ide: Remove ide_spin_wait_hwgroup() and use special requests instead
  ide: move IDE{FLOPPY,TAPE}_WAIT_CMD defines to <linux/ide.h>
  ide: add ide_do_test_unit_ready() helper
  ide: add ide_do_start_stop() helper
  ide: add ide_set_media_lock() helper
  ide-floppy: move floppy ioctls handling to ide-floppy_ioctl.c
  ide-floppy: ->{srfp,wp} -> IDE_AFLAG_{SRFP,WP}
  ide: add ide_queue_pc_tail() helper
  ide: add ide_queue_pc_head() helper
  ide: add ide_init_pc() helper
  ide-tape: add ide_tape_set_media_lock() helper
  ide-floppy: add ide_floppy_set_media_lock() helper
  ide: add ide_io_buffers() helper
  ide-scsi: cleanup ide_scsi_io_buffers()
  ide-floppy: remove MODE_SENSE_* defines
  ide-{floppy,tape}: remove packet command stack
  ide-{floppy,tape}: remove request stack
  ide-generic: handle probing of legacy io-ports v5
  ide-floppy: use scatterlists for pio transfers
  ide-tape: remove idetape_init_rq()
  ...
2008-10-11 13:22:33 -07:00
David S. Miller 56c5d900db Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	sound/core/memalloc.c
2008-10-11 12:39:35 -07:00
Linus Torvalds 4dd9ec4946 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1075 commits)
  myri10ge: update driver version number to 1.4.3-1.369
  r8169: add shutdown handler
  r8169: preliminary 8168d support
  r8169: support additional 8168cp chipset
  r8169: change default behavior for mildly identified 8168c chipsets
  r8169: add a new 8168cp flavor
  r8169: add a new 8168c flavor (bis)
  r8169: add a new 8168c flavor
  r8169: sync existing 8168 device hardware start sequences with vendor driver
  r8169: 8168b Tx performance tweak
  r8169: make room for more specific 8168 hardware start procedure
  r8169: shuffle some registers handling around (8168 operation only)
  r8169: new phy init parameters for the 8168b
  r8169: update phy init parameters
  r8169: wake up the PHY of the 8168
  af_key: fix SADB_X_SPDDELETE response
  ath9k: Fix return code when ath9k_hw_setpower() fails on reset
  ath9k: remove nasty FAIL macro from ath9k_hw_reset()
  gre: minor cleanups in netlink interface
  gre: fix copy and paste error
  ...
2008-10-11 09:33:18 -07:00
Bartlomiej Zolnierkiewicz f26b3d7595 hd: WIN_* -> ATA_CMD_*
* Use ATA_CMD_* defines instead of WIN_* ones.

* Include <linux/ata.h> directly instead of through <linux/hdreg.h>.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-10-10 22:39:21 +02:00
Denis ChengRq 6feef531f5 block: mark bio_split_pool static
Since all bio_split calls refer to the same single bio_split_pool, the
bio_split function can use bio_split_pool directly instead of taking a
mempool_t parameter.

The mempool_t parameter can then be removed from the bio_split parameter list,
and bio_split_pool, now referenced only in fs/bio.c, can be marked static.

Signed-off-by: Denis ChengRq <crquan@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:57:05 +02:00
Kiyoshi Ueda 8316982ac0 virtio_blk: change to use __blk_end_request()
This patch converts virtio_blk to use __blk_end_request() directly
so that end_{queued|dequeued}_request() can be removed.
Related 'uptodate' argument is converted to 'error'.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:20 +02:00
Keith Wansbrough 9e49184c82 floppy: support arbitrary first-sector numbers
The current floppy_struct allows floppies to number sectors starting
from 0 or 1.  This patch allows arbitrary first-sector numbers - for
example, 0xC1 for Amstrad CPC disks.

This extends the existing 1-bit field (FD_ZEROBASED, bit 2 of stretch)
to 8 bits (FD_SECTMASK, bits 2 to 9).

Currently 0x00 denotes a first sector number of 1, and 0x01 denotes a
first sector number of 0.  We extend this by interpreting FD_SECTMASK
as the first sector number with the LSB flipped.
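
The encoding described above can be sketched as follows.  The mask value is
derived from "bits 2 to 9" in this message and the helper names are
hypothetical; the macros in the patch itself may be spelled differently.

```c
/* Bit 2 of stretch was the old 1-bit FD_ZEROBASED flag. */
#define FD_ZEROBASED 4
/* Bits 2 to 9 of stretch, as described in the commit message. */
#define FD_SECTMASK  0x3FC

/* Decode: extract the 8-bit field and flip its LSB to get the first
 * sector number. */
static unsigned int first_sector_from_stretch(unsigned int stretch)
{
	return ((stretch & FD_SECTMASK) >> 2) ^ 1;
}

/* Encode: flip the LSB of the first sector number and place it in
 * bits 2 to 9. */
static unsigned int stretch_bits_for_first_sector(unsigned int first)
{
	return ((first ^ 1) << 2) & FD_SECTMASK;
}
```

With this scheme a field of 0x00 still decodes to first sector 1 and the old
FD_ZEROBASED value still decodes to first sector 0, so existing settings keep
their meaning, while 0xC1 for Amstrad CPC disks is stored as 0xC0 in the field.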

Signed-off-by: Keith Wansbrough <keith@lochan.org>
Cc: Alain Knaff <alain@linux.lu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Karel Zak <kzak@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:19 +02:00
Julia Lawall 061837bc86 drivers/block: Use DIV_ROUND_UP
The kernel.h macro DIV_ROUND_UP performs the computation (((n) + (d) - 1) /
(d)) but is perhaps more readable.

An extract of the semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@haskernel@
@@

#include <linux/kernel.h>

@depends on haskernel@
expression n,d;
@@

(
- (n + d - 1) / d
+ DIV_ROUND_UP(n,d)
|
- (n + (d - 1)) / d
+ DIV_ROUND_UP(n,d)
)

@depends on haskernel@
expression n,d;
@@

- DIV_ROUND_UP((n),d)
+ DIV_ROUND_UP(n,d)

@depends on haskernel@
expression n,d;
@@

- DIV_ROUND_UP(n,(d))
+ DIV_ROUND_UP(n,d)
// </smpl>
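
For reference, a small standalone sketch of what the macro computes.  The
macro body is the expression quoted at the top of this message; the
sectors_needed() helper and the 512-byte sector size are assumptions made up
for the example.

```c
/* Same expression as quoted above: integer division rounded up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of 512-byte sectors needed to hold nbytes. */
static unsigned long sectors_needed(unsigned long nbytes)
{
	return DIV_ROUND_UP(nbytes, 512UL);
}
```

Note that the macro evaluates d twice, so arguments with side effects should
be avoided.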

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:19 +02:00
scameron@beardog.cca.cpqcorp.net 905bd78f21 cciss: Fix cciss SCSI rescan code to better notice device changes
Fix cciss SCSI rescan code to better notice device changes.
If you hot-unplug a tape drive, then hot-plug a different
tape drive into the same slot in a storage enclosure,
the cciss driver wouldn't notice anything had changed, as
it was only looking at the LUN address and device type.
Now it looks at the inquiry page 0x83 device identifier,
and vendor and model strings as well.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:18 +02:00
Chris Lalancette 9246b5f06d block: Expand Xen blkfront for > 16 xvd
Until recently, the maximum number of xvd block devices you could attach
to a Xen domU was 16. This limitation turned out to be problematic for
some users, so it was expanded to handle a much larger number of disks.
However, this requires a couple of changes in the way that blkfront
scans for disks. This functionality is already present in the Xen
linux-2.6.18-xen.hg tree; the attached patch adds this functionality to
the mainline xen-blkfront implementation. I successfully tested it on a
2.6.25 tree, and build tested it on 2.6.27-rc3.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:18 +02:00
Tejun Heo 074a7aca7a block: move stats from disk to part0
Move stats related fields - stamp, in_flight, dkstats - from disk to
part0 and unify stat handling such that...

* part_stat_*() now updates part0 together if the specified partition
  is not part0.  ie. part_stat_*() are now essentially all_stat_*().

* {disk|all}_stat_*() are gone.

* part_round_stats() is updated similarly.  It handles part0 stats
  automatically and disk_round_stats() is killed.

* part_{inc|dec}_in_flight() is implemented which automatically updates
  part0 stats for parts other than part0.

* disk_map_sector_rcu() is updated to return part0 if no part matches.
  Combined with the above changes, this makes NULL special case
  handling in callers unnecessary.

* Separate stats show code paths for disk are collapsed into part
  stats show code paths.

* Rename disk_stat_lock/unlock() to part_stat_lock/unlock()

While at it, reposition stat handling macros a bit and add missing
parentheses around macro parameters.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:08 +02:00
Tejun Heo 80795aefb7 block: move capacity from disk to part0
Move disk->capacity to part0->nr_sects and convert all users who
directly accessed the field to use {get|set}_capacity().  This is done
early to allow the __dev field to be moved.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo ed9e198234 block: implement and use {disk|part}_to_dev()
Implement {disk|part}_to_dev() and use them to access generic device
instead of directly dereferencing {disk|part}->dev.  To make sure no
user is left behind, rename generic devices fields to __dev.

This is in preparation of unifying partition 0 handling with other
partitions.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:07 +02:00
Tejun Heo c995905916 block: fix diskstats access
There are two variants of stat functions - ones prefixed with double
underbars which don't care about preemption and ones without which
disable preemption before manipulating per-cpu counters.  It's unclear
whether the underbarred ones assume that preemption is disabled on
entry as some callers don't do that.

This patch unifies diskstats access by implementing disk_stat_lock()
and disk_stat_unlock() which take care of both RCU (for partition
access) and preemption (for per-cpu counter access).  diskstats access
should always be enclosed between the two functions.  As such, there's
no need for the versions which disables preemption.  They're removed
and double underbars ones are renamed to drop the underbars.  As an
extra argument is added, there's no danger of using the old version
unconverted.

disk_stat_lock() uses get_cpu() and returns the cpu index and all
diskstat functions which access per-cpu counters now have a @cpu
argument to help RT.

This change adds RCU or preemption operations at some places but also
collapses several preemption ops into one at others.  Overall, the
performance difference should be negligible as all involved ops are
very lightweight per-cpu ones.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:06 +02:00
Tejun Heo e71bf0d0ee block: fix disk->part[] dereferencing race
disk->part[] is protected by its matching bdev's lock.  However,
non-critical accesses like collecting stats and printing out sysfs and
proc information used to be performed without any locking.  As
partitions can come and go dynamically, partitions can go away
underneath those non-critical accesses.  As some of those accesses are
writes, this theoretically can lead to silent corruption.

This patch fixes the race by using RCU for the partition array and dev
reference counter to hold partitions.

* Rename disk->part[] to disk->__part[] to make sure no one outside
  genhd layer proper accesses it directly.

* Use RCU for disk->__part[] dereferencing.

* Implement disk_{get|put}_part() which can be used to get and put
  partitions from gendisk respectively.

* Iterators are implemented to help iterate through all partitions
  safely.

* Functions which require RCU readlock are marked with _rcu suffix.

* Use disk_put_part() in __blkdev_put() instead of directly putting
  the contained kobject.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:06 +02:00
Tejun Heo f331c0296f block: don't depend on consecutive minor space
* Implement disk_devt() and part_devt() and use them to directly
  access devt instead of computing it from ->major and ->first_minor.

  Note that all references to ->major and ->first_minor outside of the
  block layer are used to determine the devt of the disk (the part0), and
  as ->major and ->first_minor will continue to represent devt for the
  disk, converting these users isn't strictly necessary.  However,
  convert them for consistency.

* Implement disk_max_parts() to avoid directly dereferencing
  genhd->minors.

* Update bdget_disk() such that it doesn't assume consecutive minor
  space.

* Move devt computation from register_disk() to add_disk() and make it
  the only one (all other usages use the initially determined value).

These changes clean up the code and will help disk->part dereference
fix and extended block device numbers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:05 +02:00
Tejun Heo 310a2c1012 block: misc updates
This patch makes the following misc updates in preparation for
disk->part dereference fix and extended block devt support.

* implement part_to_disk()

* fix comment about gendisk->part indexing

* rename get_part() to disk_map_sector()

* don't use n which is always zero while printing disk information in
  diskstats_show()

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:04 +02:00
Fernando Luis Vázquez Cao 766ca4428d virtio_blk: use a wrapper function to access io context information of IO requests
struct request has an ioprio member but it is never updated because
currently bios do not hold io context information. The implication of
this is that virtio_blk ends up passing useless information to the
backend driver.

That said, some IO schedulers such as CFQ do store io context
information in struct request, but use private members for that, which
means that that information cannot be directly accessed in an IO
scheduler-independent way.

This patch adds a function to obtain the ioprio of a request. We should
avoid accessing ioprio directly and use this function instead, so that
its users do not have to care about future changes in block layer
structures or what the currently active IO controller is.

This patch does not introduce any functional changes but paves the way
for future clean-ups and enhancements.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:02 +02:00
David Woodhouse 1a8e2bddd5 Kill REQ_TYPE_FLUSH
It was only used by ps3disk, and it should probably have been
REQ_TYPE_LINUX_BLOCK + REQ_LB_OP_FLUSH.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-09 08:56:02 +02:00
Johann Felix Soden 0bb08107ed powerpc/iseries: Remove unused variable in viodasd.c
The variable statindex in send_request is never read, so remove it.

Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-10-07 14:26:19 +11:00
David S. Miller d87798450a aoe: Fix OOPS after SKB queue changes.
Reported by Thomas Graf.

If we don't unlink the SKB from the queue when we send it
out in aoenet_xmit(), dev_hard_start_xmit() will see skb->next
as non-NULL and interpret this to mean the SKB is part of a
GSO segment list.

Add __skb_unlink() call to fix that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-23 20:47:22 -07:00
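
The aoe fix above relies on a packet's next pointer being NULL once it has left its queue. A minimal userspace sketch of that idea follows. All names here (struct pkt, pkt_queue, queue_unlink, looks_like_segment_list) are hypothetical stand-ins and not the kernel's sk_buff API; the last helper only imitates the kind of check the commit message attributes to dev_hard_start_xmit().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet type; stands in for an SKB in this sketch. */
struct pkt {
    struct pkt *next;
    struct pkt *prev;
    int id;
};

/* Circular doubly linked queue with a sentinel head node. */
struct pkt_queue {
    struct pkt head;
    size_t len;
};

static void queue_init(struct pkt_queue *q)
{
    q->head.next = &q->head;
    q->head.prev = &q->head;
    q->len = 0;
}

/* Append p at the tail of q. */
static void queue_tail(struct pkt_queue *q, struct pkt *p)
{
    p->prev = q->head.prev;
    p->next = &q->head;
    q->head.prev->next = p;
    q->head.prev = p;
    q->len++;
}

/* Remove p from q and clear its links so no stale pointer survives. */
static void queue_unlink(struct pkt_queue *q, struct pkt *p)
{
    assert(q->len > 0);
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->next = NULL;
    p->prev = NULL;
    q->len--;
}

/* Stand-in for a transmit path that treats a non-NULL next as "more segments". */
static int looks_like_segment_list(const struct pkt *p)
{
    return p->next != NULL;
}
```

A packet handed to the transmit path while still queued would be misread by looks_like_segment_list(); unlinking it first avoids that.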
Kumar Gala 68e1ee62f0 powerpc: convert CONFIG_PPC_MERGE to CONFIG_PPC for legacy io checks
Now that arch/ppc is dead, CONFIG_PPC_MERGE is always defined for all
powerpc platforms.  Since we want to get rid of CONFIG_PPC_MERGE, use
CONFIG_PPC instead.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-09-23 10:41:28 -05:00
David S. Miller e9bb8fb0b6 aoe: Use SKB interfaces for list management instead of home-grown stuff.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-21 22:36:49 -07:00
David S. Miller 2e57572a50 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Conflicts:

	arch/sparc64/kernel/pci_psycho.c
2008-09-16 14:11:43 -07:00
Ingo Molnar 3ce9bcb583 Merge branch 'core/xen' into x86/xen
2008-09-10 14:05:45 +02:00
David S. Miller 3d452e55ef sparc64: Apply const or __initdata to vio_device_id[]
This mirrors the of_device_id[] changes done in
fd098316ef ("sparc: Annotate
of_device_id arrays with const or __initdata.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-01 01:48:52 -07:00
Linus Torvalds 8560c650f3 Revert "pktcdvd: push BKL down into driver"
This reverts commit 5b6155ee70, because
the block device ioctl's really aren't ready for it.

In particular, the "struct file *" and the "struct inode *" arguments do
not necessarily match, which means that the unlocked version of the
ioctl (that only gets a "struct file *") isn't actually able to handle
the cases it needs to handle.

This fixes bugzilla

	http://bugzilla.kernel.org/show_bug.cgi?id=11401

Reported-and-bisected-by: Laurent Riffard <laurent.riffard@free.fr>
Acked-by: Peter Osterlund <petero2@telia.com>
Cc: Alan Cox <alan@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-27 13:42:00 -07:00
Ingo Molnar e4f807c2b4 Merge branch 'linus' into x86/xen
Conflicts:
	arch/x86/kernel/paravirt.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:54:07 +02:00
Akinobu Mita c82f296601 brd: fix name argument of unregister_blkdev()
The name of the brd block device is "ramdisk", not "brd".
The block device is registered by register_blkdev(RAMDISK_MAJOR, "ramdisk"),
so it should be unregistered by unregister_blkdev(RAMDISK_MAJOR, "ramdisk").

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:30 -07:00
Sven Wegener f3944d61dd nbd: fix memory leak of nbd_dev array
We leak the memory allocated for the nbd_dev array at multiple places.
Fix them by either adding a kfree() or by rearranging code to return
before we allocate the memory.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-20 15:40:30 -07:00
Jeremy Fitzhardinge 6e833587e1 xen: clean up domain mode predicates
There are four operating modes Xen code may find itself running in:
 - native
 - hvm domain
 - pv dom0
 - pv domU

Clean up predicates for testing for these states to make them more consistent.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:40:07 +02:00
Adrian Bunk 62aa0054da xen-blkfront.c: make blkif_ioctl() static
This patch makes the needlessly global blkif_ioctl() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:04 +02:00
Mike Miller ba198efb5e cciss: fix bug if scsi tape support is disabled
Bug fix. If SCSI tape support is turned off we get an implicit declaration
of cciss_unregister_scsi error in cciss_remove_one.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:04 +02:00
Mike Miller 935dc8d757 cciss: add support for multi lun tape devices
This patch adds support for multi-lun devices in a SAS environment. It's
required for the support of media changers.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:04 +02:00
Mike Miller f4a93bcda7 cciss: change the way we notify scsi midlayer of tape drives
This patch changes the way we notify the scsi layer that something has changed
on the SCSI tape side of the driver. The user can now just tell the driver
to rescan a particular controller rather than having to know the SCSI nexus
to echo into the SCSI mid-layer.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:04 +02:00
Mike Miller eece695f8b cciss: fix negative logical drive count in procfs
This patch fixes a problem where the logical volume count may go negative.
In some instances, if several logical volumes are configured on a controller
and all of them are deleted using the online utilities, the volume count in
/proc may go negative with no way to get it correct again.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:03 +02:00
Mike Miller 6ae5ce8e8d cciss: remove redundant code
This patch removes redundant code wherever logical volumes are added or
removed. It adds 3 new functions that are called instead of having the same
code spread throughout the driver. It also removes the cciss_getgeometry
function.
The patch is fairly complex but we haven't figured out how to make it any
simpler and still do everything that needs to be done. Some of the
complexity comes from having to special case booting from cciss. Otherwise
the gendisk doesn't get added in time and the switchroot will fail.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:03 +02:00
Mike Miller a72da29b6c cciss: make rebuild_lun_table behave better
This patch makes the rebuild_lun_table smart enough to not rip a logical
volume out from under the OS. Without this fix if a customer is running
hpacucli to monitor their storage the driver will blindly remove and re-add
the disks whenever the utility calls the CCISS_REGNEWD ioctl. Unfortunately,
both hpacucli and ACUXE call the ioctl repeatedly. Customers have reported
IO coming to a standstill. Calling the ioctl is the problem, this patch is
the fix.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:03 +02:00
Nikanth Karthikesan f7108f91cd cciss: return -EFAULT if copy_from_user() fails
Return -EFAULT instead of -ENOMEM if copy_from_user() fails.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-08-06 12:30:03 +02:00
Hannes Reinecke 756fcab277 block/cciss.c: remove pointless curr_queue calculation
curr_queue is a local variable in a for loop, and it's being initialized
at the start of each loop.  So any assignment at the end of the loop is
pointless.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-05 14:33:46 -07:00
Niels de Vos 61a2d07d3f Remove newline from the description of module parameters
Some module parameters with only one line have the '\n' at the end of the
description.  This is not needed nor wanted as after the description the
type (i.e.  int) is followed by a newline.

Some modules contain a multi-line description, these are not affected
by this patch.

Signed-off-by: Niels de Vos <niels.devos@wincor-nixdorf.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Ed L. Cashin <ecashin@coraid.com>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Roland Dreier <rolandd@cisco.com>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-01 12:46:41 -07:00
Matthias Kaehlcke 24879a8e3e aoe: convert emsgs_sema into a completion
ATA over Ethernet: The semaphore emsgs_sema is used for signalling an
event, so convert it into a completion.

Signed-off-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:45 -07:00
Christian Borntraeger 066f4d82a6 virtio_blk: check for hardsector size from host
Currently virtio_blk assumes a 512 byte hard sector size. This can cause
trouble / performance issues if the backing has a different block size
(like a file on an ext3 file system formatted with 4k block size or a dasd).

Let's add a feature flag that tells the guest to use a different hard sector
size than 512 bytes.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-25 12:06:05 +10:00
Greg Kroah-Hartman f79f060561 device create: block: convert device_create to device_create_drvdata
device_create() is race-prone, so use the race-free
device_create_drvdata() instead as device_create() is going away.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:41 -07:00
Geert Uytterhoeven e945b568e2 m68k: Return -ENODEV if no device is found
According to the tests in do_initcalls(), the proper error code in case no
device is found is -ENODEV, not -ENXIO or -EIO.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:38 -07:00
Adrian Bunk 72a3d651b2 hd.c: remove the #include <linux/mc146818rtc.h>
The code that needed this #include was removed one year ago.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: rmk@arm.linux.org.uk
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:48 +02:00
Adrian Bunk f327c1c33f update the BLK_DEV_HD help text
Many people will see this option the first time now that it is in
drivers/block/

Make it clear that virtually no one needs it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: rmk@arm.linux.org.uk
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:47 +02:00
Adrian Bunk 453ea3ed0b move ide/legacy/hd.c to drivers/block/
This patch moves hd.c to drivers/block/

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: rmk@arm.linux.org.uk
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:47 +02:00
Linus Torvalds 98339cbd36 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (80 commits)
  ide-floppy: fix unfortunate function naming
  ide-tape: unify idetape_create_read/write_cmd
  ide: add ide_pc_intr() helper
  ide-{floppy,scsi}: read Status Register before stopping DMA engine
  ide-scsi: add more debugging to idescsi_pc_intr()
  ide-scsi: use pc->callback
  ide-floppy: add more debugging to idefloppy_pc_intr()
  ide-tape: always log debug info in idetape_pc_intr() if debugging is enabled
  ide-tape: add ide_tape_io_buffers() helper
  ide-tape: factor out DSC handling from idetape_pc_intr()
  ide-{floppy,tape}: move checking of ->failed_pc to ->callback
  ide: add ide_issue_pc() helper
  ide: add PC_FLAG_DRQ_INTERRUPT pc flag
  ide-scsi: move idescsi_map_sg() call out from idescsi_issue_pc()
  ide: add ide_transfer_pc() helper
  ide-scsi: set drive->scsi flag for devices handled by the driver
  ide-{cd,floppy,tape}: remove checking for drive->scsi
  ide: add PC_FLAG_ZIP_DRIVE pc flag
  ide-tape: factor out waiting for good ireason from idetape_transfer_pc()
  ide-tape: set PC_FLAG_DMA_IN_PROGRESS flag in idetape_transfer_pc()
  ...
2008-07-15 11:15:36 -07:00
FUJITA Tomonori d79c5a670d block: convert pd_special_command to use blk_execute_rq
pd_special_command uses blk_put_request with struct request on the
stack. As a result, blk_put_request needs a hack to catch a NULL
request_queue.  This converts pd_special_command to use
blk_execute_rq.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:45 +02:00
Linus Torvalds d1794f2c5b Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6
* 'bkl-removal' of git://git.lwn.net/linux-2.6: (146 commits)
  IB/umad: BKL is not needed for ib_umad_open()
  IB/uverbs: BKL is not needed for ib_uverbs_open()
  bf561-coreb: BKL unneeded for open()
  Call fasync() functions without the BKL
  snd/PCM: fasync BKL pushdown
  ipmi: fasync BKL pushdown
  ecryptfs: fasync BKL pushdown
  Bluetooth VHCI: fasync BKL pushdown
  tty_io: fasync BKL pushdown
  tun: fasync BKL pushdown
  i2o: fasync BKL pushdown
  mpt: fasync BKL pushdown
  Remove BKL from remote_llseek v2
  Make FAT users happier by not deadlocking
  x86-mce: BKL pushdown
  vmwatchdog: BKL pushdown
  vmcp: BKL pushdown
  via-pmu: BKL pushdown
  uml-random: BKL pushdown
  uml-mmapper: BKL pushdown
  ...
2008-07-14 14:48:31 -07:00
Jonathan Corbet 2fceef397f Merge commit 'v2.6.26' into bkl-removal
2008-07-14 15:29:34 -06:00
Linus Torvalds dddec01eb8 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (37 commits)
  splice: fix generic_file_splice_read() race with page invalidation
  ramfs: enable splice write
  drivers/block/pktcdvd.c: avoid useless memset
  cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack)
  scsi: sr avoids useless buffer allocation
  block: blk_rq_map_kern uses the bounce buffers for stack buffers
  block: add blk_queue_update_dma_pad
  DAC960: push down BKL
  pktcdvd: push BKL down into driver
  paride: push ioctl down into driver
  block: use get_unaligned_* helpers
  block: extend queue_flag bitops
  block: request_module(): use format string
  Add bvec_merge_data to handle stacked devices and ->merge_bvec()
  block: integrity flags can't use bit ops on unsigned short
  cmdfilter: extend default read filter
  sg: fix odd style (extra parenthesis) introduced by cmd filter patch
  block: add bounce support to blk_rq_map_user_iov
  cfq-iosched: get rid of enable_idle being unused warning
  allow userspace to modify scsi command filter on per device basis
  ...
2008-07-14 13:15:14 -07:00
Mike Miller 491539982a cciss: read config to obtain max outstanding commands per controller
This patch changes the way we determine the maximum number of outstanding
commands for each controller.

Most Smart Array controllers can support up to 1024 commands, the notable
exceptions are the E200 and E200i.

The next generation of controllers which were just added support a mode of
operation called Zero Memory Raid (ZMR).  In this mode they only support
64 outstanding commands.  In Full Function Raid (FFR) mode they support
1024.

We have been setting the queue depth by arbitrarily assigning some value
for each controller.  We needed a better way to set the queue depth to
avoid lots of annoying "fifo full" messages.  So we made the driver a
little smarter.  We now read the config table and subtract 4 from the
returned value.  The -4 is to allow some room for ioctl calls which are
not tracked the same way as io commands are tracked.

Please consider this for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04 10:40:09 -07:00
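
The queue depth rule described in the cciss commit above comes down to one subtraction. The sketch below shows only that arithmetic. The name usable_queue_depth and the IOCTL_RESERVE macro are hypothetical; the commit message does not show how the driver reads the config table, so the value is simply passed in.

```c
#include <assert.h>

/* The commit message reserves 4 slots for ioctl calls. */
#define IOCTL_RESERVE 4u

/*
 * max_outstanding: the command limit the controller reports
 * (hypothetical parameter; reading the config table is not shown).
 * Returns the depth left for regular I/O commands.
 */
static unsigned int usable_queue_depth(unsigned int max_outstanding)
{
    assert(max_outstanding > IOCTL_RESERVE);
    return max_outstanding - IOCTL_RESERVE;
}
```

With the figures quoted in the message, a controller reporting 1024 commands would get a depth of 1020, and one in Zero Memory Raid mode reporting 64 would get 60.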
Stephen M. Cameron 77b96bd7e5 cciss: fix regression that no device nodes are created if no logical drives are configured.
Fix regression in cciss driver that if no logical drives are configured,
no device nodes at all get created.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04 10:40:06 -07:00
Christophe Jaillet 7c0c0b5b19 drivers/block/pktcdvd.c: avoid useless memset
Avoid the 'memset(...,0, ...)' before calling 'init_cdrom_command' because
this function already does it.

Signed-off-by: Christophe Jaillet <jaillet.christophe@wanadoo.fr>
Acked-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-04 09:52:14 +02:00
Alan Cox 2610324fca DAC960: push down BKL
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Richard Knutsson <ricknu-0@student.ltu.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-04 09:52:13 +02:00
Alan Cox 5b6155ee70 pktcdvd: push BKL down into driver
Push the lock_kernel down into the driver and switch to unlocked_ioctl

[akpm@linux-foundation.org: build fix]
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-04 09:52:13 +02:00
Alan Cox be1fd70fea paride: push ioctl down into driver
Leaves us with lock_kernel for two methods.  Also remove a bogus printk
with no printk level and return -ENOTTY not -EINVAL for correctness.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

(Jens: added smp_lock.h include to pt.c, otherwise it won't compile because
 of missing {un}lock_kernel() definition)

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-04 09:51:21 +02:00
Harvey Harrison 823ed72e8f block: use get_unaligned_* helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-04 09:28:32 +02:00
Alasdair G Kergon cc371e66e3 Add bvec_merge_data to handle stacked devices and ->merge_bvec()
When devices are stacked, one device's merge_bvec_fn may need to perform
the mapping and then call one or more functions for its underlying devices.

The following bio fields are used:
  bio->bi_sector
  bio->bi_bdev
  bio->bi_size
  bio->bi_rw  using bio_data_dir()

This patch creates a new struct bvec_merge_data holding a copy of those
fields to avoid having to change them directly in the struct bio when
going down the stack only to have to change them back again on the way
back up.  (And then when the bio gets mapped for real, the whole
exercise gets repeated, but that's a problem for another day...)

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Milan Broz <mbroz@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:15 +02:00
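
The idea in the bvec_merge_data commit above can be sketched in userspace C: a stacked device remaps a small copy of the four fields and hands that copy to the lower device, so the original bio is never modified. The types and functions below (fake_bio, merge_data, lower_merge_limit, stacked_merge_limit) are hypothetical stand-ins, not the kernel's definitions, and the chunk-boundary rule in the lower device is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BYTES  512u
#define CHUNK_SECTORS 8u   /* hypothetical boundary in the lower device */

/* Stand-in for struct bio, reduced to the four fields named above. */
struct fake_bio {
    uint64_t bi_sector;
    int bi_bdev;            /* device id instead of a pointer */
    unsigned int bi_size;   /* bytes */
    unsigned long bi_rw;
};

/* Copy of those fields that is safe to remap on the way down the stack. */
struct merge_data {
    uint64_t bi_sector;
    int bi_bdev;
    unsigned int bi_size;
    unsigned long bi_rw;
};

static struct merge_data merge_data_from_bio(const struct fake_bio *bio)
{
    struct merge_data md = { bio->bi_sector, bio->bi_bdev,
                             bio->bi_size, bio->bi_rw };
    return md;
}

/* Lower device: bytes that can still be added before the chunk ends. */
static unsigned int lower_merge_limit(const struct merge_data *md)
{
    uint64_t end = md->bi_sector + md->bi_size / SECTOR_BYTES;
    uint64_t chunk_end = (md->bi_sector / CHUNK_SECTORS + 1) * CHUNK_SECTORS;

    assert(end <= chunk_end);
    return (unsigned int)(chunk_end - end) * SECTOR_BYTES;
}

/* Stacked device: remap a private copy, then ask the lower device. */
static unsigned int stacked_merge_limit(const struct merge_data *md,
                                        uint64_t offset, int lower_dev)
{
    struct merge_data mapped = *md;

    mapped.bi_sector += offset;
    mapped.bi_bdev = lower_dev;
    return lower_merge_limit(&mapped);
}
```

Because only the copy is remapped, nothing has to be restored in the bio on the way back up the stack.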
Ian Campbell a144ff09bc xen: Avoid allocations causing swap activity on the resume path
Avoid allocations causing swap activity on the resume path by
preventing the allocations from doing IO and allowing them
to access the emergency pools.

These paths are used when a frontend device is trying to connect
to its backend driver over Xenbus.  These reconnections are triggered
on demand by IO, so by definition there is already IO underway,
and further IO would naturally deadlock.  On resume, this path
is triggered when the running system tries to continue using its
devices.  If it cannot then the resume will fail; to try to avoid this
we let it dip into the emergency pools.

[ linux-2.6.18-xen changesets e8b49cfbdac, fdb998e79aba ]

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:13 +02:00
Jan Beulich 5a60d0cd4f xen/blkfront: add __exit to module_exit() handlers
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:13 +02:00
Wim Colgate 04c0635058 xen/blkfront: Make sure that the device is fully ready before allowing release.
[ linux-2.6.18-xen changeset c1c57fea77e9 ]

Signed-off-by: Wim Colgate <wim@xensource.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:13 +02:00
Christian Limpach 440a01a7f4 xen/blkfront: Add the CDROM_GET_CAPABILITY ioctl to blkfront.
Return 0 instead of -EINVAL if the blkfront device is a cdrom,
i.e. has the VDISK_CDROM attribute.  This allows udev's cdrom_id
to correctly detect the device as a cdrom device.

[ Add blkif_ioctl, and CDROMMULTISESSION ]

[ linux-2.6.18-xen changeset d2bd9af846b5 ]

Signed-off-by: Christian Limpach <Christian.Limpach@xensource.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:13 +02:00
Ian Campbell 1c91fe1a0d xen/blkfront: Make sure we don't use bounce buffers, we don't need them.
[ linux-2.6.18-xen changeset 667228bf8fc5 ]

Signed-off-by: Ian Campbell <ian.campbell@xensource.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-03 13:21:12 +02:00
Jonathan Corbet ea2959a297 paride: cdev lock_kernel() pushdown
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-06-20 14:03:43 -06:00
Jonathan Corbet 579174a55f AoE: cdev lock_kernel() pushdown
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2008-06-20 14:03:43 -06:00
Mike Miller 24aac480e7 cciss: add new hardware support
Add support for the next generation of HP Smart Array SAS/SATA
controllers.  Shipping date is late Fall 2008.

Bump the driver version to 3.6.20 to reflect the new hardware support from
patch 1 of this set.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-12 18:05:40 -07:00
Nick Piggin efedf51c86 Add 'rd' alias to new brd ramdisk driver
Alias brd to rd in the hope of helping legacy users. Suggested by Jan.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-05 14:23:12 -07:00
Christian Borntraeger 3ef5360954 virtio_blk: allow read-only disks
Hello Rusty,

sometimes it is useful to share a disk (e.g. usr). To avoid file system
corruption, the disk should be mounted read-only in that case. This patch
adds a new feature flag that allows the host to specify whether the disk
should be considered read-only.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-30 15:09:44 +10:00
Chris Lalancette ac9d463afb Fix crash in virtio_blk during modprobe ; rmmod ; modprobe
Fix a modprobe virtio_blk ; rmmod virtio_blk ; modprobe virtio_blk crash; this
was basically because we weren't doing "del_gendisk()" in the remove path.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (moved del_gendisk up)
2008-05-30 15:09:41 +10:00
Marcin Krol 53978d0a7a brd: don't show ramdisks in /proc/partitions
In 2.6.25, ramdisk devices show up in /proc/partitions, which is a
behaviour change from the old rd.c.  Add GENHD_FL_SUPPRESS_PARTITION_INFO,
which was present in rd.c.

All kernels prior to 2.6.25 weren't displaying ramdisks in
/proc/partitions.  Since there are many userspace tools using information
from /proc/partitions, some of them may now behave incorrectly (I haven't
tested any, though).  For example, before 2.6.25 /proc/partitions was empty
if no block devices like hard disks and such were detected by kernel.  Now
all 16 ramdisks are always visible there.  Some software may rely on such
information (I mean, on empty /proc/partitions).

There was a quite similar situation back in 2004, and ramdisks were excluded
from display again.  That's why I called this a regression (maybe a bit
unfortunate).  See this patch for info:
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.3-rc2/2.6.3-rc2-mm1/broken-out/nbd-proc-partitions-fix.patch

I also think that someone somewhere (long time ago) excluded ramdisks from
/proc/partitions for good reasons.  It is possible that now such new
"feature" is harmless, but I think there are more chances that someone
will say "hey, /proc/partitions has changed, now my software doesn't work"
than "hey where did my new 2.6.25 feature go".  nbd devices are also
excluded, maybe for the very same (unknown to me) reasons.

Signed-off-by: Marcin Krol <hawk@pld-linux.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-24 09:56:11 -07:00
Stephen Rothwell 8962cadbe7 [POWERPC] iSeries: Remove unused mail address
I don't use my IBM email address normally and people can find me in
CREDITS.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-23 16:45:04 +10:00
Geert Uytterhoeven fd5b462f0b m68k: Return -ENODEV if no device is found
According to the tests in do_initcalls(), the proper error code in case no
device is found is -ENODEV, not -ENXIO or -EIO.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-18 13:28:50 -07:00
Jens Axboe 28f13702f0 block: avoid duplicate calls to get_part() in disk stat code
get_part() is fairly expensive, as it O(N) loops over partitions
to find the right one. In lots of normal IO paths we end up looking
up the partition twice, to make matters even worse. Change the
stat add code to accept a passed in partition instead.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-05-07 10:15:46 +02:00
Linus Torvalds 1be1d6b7f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (32 commits)
  USB GADGET/PERIPHERAL: g_file_storage Bulk-Only Transport compliance, clear-feature ignore
  USB GADGET/PERIPHERAL: g_file_storage Bulk-Only Transport compliance
  usb_serial: some coding style fixes
  USB: Remove redundant dependencies on USB_ATM.
  USB: UHCI: disable remote wakeup when it's not needed
  USB: OHCI: work around bogus compiler warning
  USB: add Cypress c67x00 OTG controller HCD driver
  USB: add Cypress c67x00 OTG controller core driver
  USB: add Cypress c67x00 low level interface code
  USB: airprime: unlock mutex instead of trying to lock it again
  USB: storage: Update mailling list address
  USB: storage: UNUSUAL_DEVS() for PanDigital Picture frame.
  USB: Add the USB 2.0 extension descriptor.
  USB: add more FTDI device ids
  USB: fix cannot work usb storage when using ohci-sm501
  usb: gadget zero timer init fix
  usb: gadget zero style fixups (mostly whitespace)
  usb serial gadget: CDC ACM fixes
  usb: pxa27x_udc driver
  USB: INTOVA Pixtreme camera mass storage device
  ...
2008-05-02 11:03:08 -07:00
Pete Zaitcev 9029b174ba ub: Cosmetics
Fix a few comments and printk statements.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-05-02 10:25:52 -07:00
Pete Zaitcev 0da13c8c3d ub: Ignore bad residue
I hoped to continue to ignore this problem or use libusual, but these
days it's simpler to work around than to deal with it. Let's attempt to
use bad residue devices and hope that upper level integrity checks catch
any problems (e.g. please use sha1sum on your backups).

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-05-02 10:25:52 -07:00
Pete Zaitcev 82fe26ba7a ub: Tune retries
Make ub fail faster in hopeless cases.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-05-02 10:25:52 -07:00
Pete Zaitcev 2c51ae70ed ub: Fix timeouts
The wodim says:
"close track/session scsi sendcmd: cmd timeout after 5.000 (480) s"
This happened because we ignored the supplied timeout and used 5s.

It's not completely correct to apply a timeout meant for the complete
command to any single URB, but we don't have many URBs per command, so
this is simple and works.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-05-02 10:25:52 -07:00
Ryan Harper 48e4043d45 virtio: add virtio disk geometry feature
Rather than faking up some geometry, allow the backend to push the disk
geometry via virtio pci config option.  Keep the old geo code around for
compatibility.

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (modified to single struct)
2008-05-02 21:50:51 +10:00
Rusty Russell c45a6816c1 virtio: explicit advertisement of driver features
A recent proposed feature addition to the virtio block driver revealed
some flaws in the API: in particular, we assume that feature
negotiation is complete once a driver's probe function returns.

There is nothing in the API to require this, however, and even I
didn't notice when it was violated.

So instead, we require the driver to specify what features it supports
in a table, we can then move the feature negotiation into the virtio
core.  The intersection of device and driver features is presented in
a new 'features' bitmap in the struct virtio_device.

Note that this highlights the difference between Linux unsigned-long
bitmaps where each unsigned long is in native endian, and a
straight-forward little-endian array of bytes.

Drivers can still remove feature bits in their probe routine if they
really have to.

API changes:
- dev->config->feature() no longer gets and acks a feature.
- drivers should advertise their features in the 'feature_table' field
- use virtio_has_feature() for extra sanity when checking feature bits

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-02 21:50:50 +10:00
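
The negotiation scheme described in the commit above can be sketched as a plain bitmap intersection: the driver lists the feature bits it supports in a table, and the core keeps only those the device also offers. This is a userspace illustration under stated assumptions. The feature numbers and the names negotiate_features and has_feature are hypothetical, not the virtio API, and a single 32-bit word replaces the kernel's unsigned-long bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit numbers, for illustration only. */
enum { FEAT_A = 1, FEAT_B = 2, FEAT_C = 5 };

/*
 * Keep only the bits that the device offers AND the driver lists
 * in its feature table.
 */
static uint32_t negotiate_features(uint32_t device_features,
                                   const unsigned int *table, size_t n)
{
    uint32_t accepted = 0;

    for (size_t i = 0; i < n; i++) {
        assert(table[i] < 32);
        if (device_features & (UINT32_C(1) << table[i]))
            accepted |= UINT32_C(1) << table[i];
    }
    return accepted;
}

/* Check one negotiated bit, with a range check for extra sanity. */
static int has_feature(uint32_t negotiated, unsigned int bit)
{
    assert(bit < 32);
    return (int)((negotiated >> bit) & 1u);
}
```

A feature the device offers but the driver does not list is never accepted, which is the property the table-based approach is meant to enforce.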
Rusty Russell 72e61eb40b virtio: change config to guest endian.
A recent proposed feature addition to the virtio block driver revealed
some flaws in the API, in particular how easy it is to break big
endian machines.

The virtio config space was originally chosen to be little-endian,
because we thought the config might be part of the PCI config space
for virtio_pci.  It's actually a separate mmio region, so that
argument holds little water; as only x86 is currently using the virtio
mechanism, we can change this (but must do so now, before the
impending s390 merge).

API changes:
- __virtio_config_val() just becomes a straight vdev->config_get() call.
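
To make the endianness point concrete, here is a small userspace sketch (not
driver code) that reads the same four config bytes two ways: decoded as
little-endian, and copied in the host's native order.  The helper names are
hypothetical.  The two reads agree only on a little-endian host, which is why
a fixed little-endian config space is easy to get wrong on big-endian
machines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit field stored little-endian, byte by byte.
 * This gives the same answer on any host. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read the same four bytes in the host's (guest's) native order. */
static uint32_t read_native32(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint8_t probe[4] = { 1, 0, 0, 0 };

    return read_native32(probe) == 1;
}
```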

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-02 21:50:50 +10:00
Marcelo Tosatti 2e895e4c23 virtio-blk: fix remove oops
Do not unregister the major at device remove, since there might be
other device instances around.

(qemu) pci_del 0 11
(qemu) ACPI: PCI interrupt for device 0000:00:0b.0 disabled
(qemu) pci_del 0 10
(qemu) ------------[ cut here ]------------
WARNING: at block/genhd.c:126 unregister_blkdev+0x74/0x9e()
ACPI: PCI interrupt for device 0000:00:0a.0 disabled

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-02 21:50:46 +10:00
Rusty Russell cb38fa23c1 virtio: de-structify virtio_block status byte
Ron Minnich points out that a struct containing a char is not always
sizeof(char); simplest to remove the structure to avoid confusion.
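
A minimal sketch of the size concern, using a hypothetical struct name rather
than the driver's own definition: sizeof(char) is 1 by definition, while a
struct wrapping a single char is only guaranteed to be at least that large,
since some ABIs pad structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name, not the driver's actual definition. */
struct status_wrapper {
    unsigned char status;
};

/* A struct wrapping one char may be padded by the ABI,
 * so only "at least 1" is guaranteed. */
static size_t wrapper_size(void)
{
    return sizeof(struct status_wrapper);
}

/* sizeof(unsigned char) is 1 by definition. */
static size_t plain_size(void)
{
    return sizeof(unsigned char);
}
```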

Cc: "ron minnich" <rminnich@gmail.com>

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-02 21:50:45 +10:00
Denis V. Lunev 3dfcf9c4bf cciss: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and further data assigned with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-01 08:04:02 -07:00
Laurent Vivier d7853d1f89 brd: modify ramdisk device to be able to manage partitions
This patch adds partition management for Block RAM Device (BRD).

This patch keeps the BRD and loop device drivers in sync.

This patch adds a parameter to the module, max_part, to specify
the maximum number of partitions per RAM device.

Example:

# modprobe brd max_part=63
# ls -l /dev/ram*
brw-rw---- 1 root disk 1,   0 2008-04-03 13:39 /dev/ram0
brw-rw---- 1 root disk 1,  64 2008-04-03 13:39 /dev/ram1
brw-rw---- 1 root disk 1, 640 2008-04-03 13:39 /dev/ram10
brw-rw---- 1 root disk 1, 704 2008-04-03 13:39 /dev/ram11
brw-rw---- 1 root disk 1, 768 2008-04-03 13:39 /dev/ram12
brw-rw---- 1 root disk 1, 832 2008-04-03 13:39 /dev/ram13
brw-rw---- 1 root disk 1, 896 2008-04-03 13:39 /dev/ram14
brw-rw---- 1 root disk 1, 960 2008-04-03 13:39 /dev/ram15
brw-rw---- 1 root disk 1, 128 2008-04-03 13:39 /dev/ram2
brw-rw---- 1 root disk 1, 192 2008-04-03 13:39 /dev/ram3
brw-rw---- 1 root disk 1, 256 2008-04-03 13:39 /dev/ram4
brw-rw---- 1 root disk 1, 320 2008-04-03 13:39 /dev/ram5
brw-rw---- 1 root disk 1, 384 2008-04-03 13:39 /dev/ram6
brw-rw---- 1 root disk 1, 448 2008-04-03 13:39 /dev/ram7
brw-rw---- 1 root disk 1, 512 2008-04-03 13:39 /dev/ram8
brw-rw---- 1 root disk 1, 576 2008-04-03 13:39 /dev/ram9
# fdisk /dev/ram0
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): o
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-2, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-2, default 2): 2

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
# ls -l /dev/ram0*
brw-rw---- 1 root disk 1, 0 2008-04-03 13:40 /dev/ram0
brw-rw---- 1 root disk 1, 1 2008-04-03 13:40 /dev/ram0p1
# mkfs /dev/ram0p1
mke2fs 1.40-WIP (14-Nov-2006)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
4016 inodes, 16032 blocks
801 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=16515072
2 block groups
8192 blocks per group, 8192 fragments per group
2008 inodes per group
Superblock backups stored on blocks:
	8193

Writing inode tables: done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
# mount /dev/ram0p1 /mnt
# df /mnt
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/ram0p1              15521       138     14582   1% /mnt
# ls -l /mnt
total 12
drwx------ 2 root root 12288 2008-04-03 13:41 lost+found
# umount /mnt
# rmmod brd
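
The minor numbers in the listing above (ram1 at 64, ram10 at 640 with
max_part=63) are consistent with each device owning a power-of-two block of
minors.  The sketch below reproduces that arithmetic in userspace, assuming
the block size comes from shifting by the position of the highest set bit of
max_part.  fls_sketch(), first_minor() and partition_minor() are hypothetical
helper names.

```c
#include <assert.h>

/* Position of the highest set bit, 1-based; 0 for x == 0.
 * A userspace stand-in for a find-last-set helper. */
static int fls_sketch(unsigned int x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* First minor number of RAM device `index` (assumed layout). */
static unsigned int first_minor(unsigned int index, unsigned int max_part)
{
    return index << fls_sketch(max_part);
}

/* Minor number of partition `partno` on device `index`. */
static unsigned int partition_minor(unsigned int index,
                                    unsigned int max_part,
                                    unsigned int partno)
{
    assert(partno <= max_part);
    return first_minor(index, max_part) + partno;
}
```

With max_part=63 this gives 64 minors per device, so /dev/ram0p1 lands on
minor 1 and /dev/ram1 on minor 64, matching the transcript.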

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30 08:29:53 -07:00
Linus Torvalds bd5d435a96 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Skip I/O merges when disabled
  block: add large command support
  block: replace sizeof(rq->cmd) with BLK_MAX_CDB
  ide: use blk_rq_init() to initialize the request
  block: use blk_rq_init() to initialize the request
  block: rename and export rq_init()
  block: no need to initialize rq->cmd with blk_get_request
  block: no need to initialize rq->cmd in prepare_flush_fn hook
  block/blk-barrier.c:blk_ordered_cur_seq() mustn't be inline
  block/elevator.c:elv_rq_merge_ok() mustn't be inline
  block: make queue flags non-atomic
  block: add dma alignment and padding support to blk_rq_map_kern
  unexport blk_max_pfn
  ps3disk: Remove superfluous cast
  block: make rq_init() do a full memset()
  relay: fix splice problem
2008-04-29 08:18:03 -07:00
Harvey Harrison f885f8d127 drivers/block: use get_unaligned_* helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Ed L. Cashin <ecashin@coraid.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:27 -07:00
Hirofumi Nakagawa 801678c5a3 Remove duplicated unlikely() in IS_ERR()
Some drivers have duplicated unlikely() macros.  IS_ERR() already has
unlikely() in itself.

This patch cleans up such pointless code.
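
For illustration, a userspace re-creation of the helpers (assuming a GCC or
Clang compiler for __builtin_expect, and simplified from the kernel headers)
shows why the outer unlikely() adds nothing: the branch hint is already
applied inside IS_ERR().

```c
#include <assert.h>

/* Userspace re-creation of the kernel helpers, for illustration only. */
#define MAX_ERRNO 4095
#define unlikely(x) __builtin_expect(!!(x), 0)
#define IS_ERR_VALUE(x) unlikely((x) >= (unsigned long)-MAX_ERRNO)

static inline void *ERR_PTR(long error)
{
    return (void *)error;
}

static inline long IS_ERR(const void *ptr)
{
    return IS_ERR_VALUE((unsigned long)ptr);
}

/* Before: the outer unlikely() repeats the hint already inside IS_ERR(). */
static int check_before(const void *p)
{
    if (unlikely(IS_ERR(p)))
        return -1;
    return 0;
}

/* After: same behaviour, one hint. */
static int check_after(const void *p)
{
    if (IS_ERR(p))
        return -1;
    return 0;
}
```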

Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:25 -07:00
Adrian Bunk 0302190411 remove aoedev_isbusy()
Remove the no longer used aoedev_isbusy().

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:24 -07:00
Laurent Vivier d71a6d7332 NBD: add partition support
Permit the use of partitions with network block devices (NBD).

A new parameter is introduced to define how many partitions we want to be able
to manage per network block device.  This parameter is "max_part".

For instance, to manage 63 partitions per network block device, we will do:

   [on the server side]
# nbd-server 1234 /dev/sdb
   [on the client side]
# modprobe nbd max_part=63
# ls -l /dev/nbd*
brw-rw---- 1 root disk 43,   0 2008-03-25 11:14 /dev/nbd0
brw-rw---- 1 root disk 43,  64 2008-03-25 11:11 /dev/nbd1
brw-rw---- 1 root disk 43, 640 2008-03-25 11:11 /dev/nbd10
brw-rw---- 1 root disk 43, 704 2008-03-25 11:11 /dev/nbd11
brw-rw---- 1 root disk 43, 768 2008-03-25 11:11 /dev/nbd12
brw-rw---- 1 root disk 43, 832 2008-03-25 11:11 /dev/nbd13
brw-rw---- 1 root disk 43, 896 2008-03-25 11:11 /dev/nbd14
brw-rw---- 1 root disk 43, 960 2008-03-25 11:11 /dev/nbd15
brw-rw---- 1 root disk 43, 128 2008-03-25 11:11 /dev/nbd2
brw-rw---- 1 root disk 43, 192 2008-03-25 11:11 /dev/nbd3
brw-rw---- 1 root disk 43, 256 2008-03-25 11:11 /dev/nbd4
brw-rw---- 1 root disk 43, 320 2008-03-25 11:11 /dev/nbd5
brw-rw---- 1 root disk 43, 384 2008-03-25 11:11 /dev/nbd6
brw-rw---- 1 root disk 43, 448 2008-03-25 11:11 /dev/nbd7
brw-rw---- 1 root disk 43, 512 2008-03-25 11:11 /dev/nbd8
brw-rw---- 1 root disk 43, 576 2008-03-25 11:11 /dev/nbd9
# nbd-client localhost 1234 /dev/nbd0
Negotiation: ..size = 80418240KB
bs=1024, sz=80418240

-------NOTE, RFC: partition table is not automatically read.
The driver sets bdev->bd_invalidated to 1 to force the read of the partition
table of the device, but this is done only on an open of the device.
So we have to do a "touch /dev/nbdX" or something like that.
It can't be done from the nbd-client or nbd driver because at this
level we can't ask to read the partition table and to serve the request
at the same time (-> deadlock)

If someone has a better idea, I'm open to any suggestion.
-------NOTE, RFC

# fdisk -l /dev/nbd0

Disk /dev/nbd0: 82.3 GB, 82348277760 bytes
255 heads, 63 sectors/track, 10011 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

     Device Boot      Start         End      Blocks   Id  System
/dev/nbd0p1   *           1        9965    80043831   83  Linux
/dev/nbd0p2            9966       10011      369495    5  Extended
/dev/nbd0p5            9966       10011      369463+  82  Linux swap / Solaris

# ls -l /dev/nbd0*
brw-rw---- 1 root disk 43,   0 2008-03-25 11:16 /dev/nbd0
brw-rw---- 1 root disk 43,   1 2008-03-25 11:16 /dev/nbd0p1
brw-rw---- 1 root disk 43,   2 2008-03-25 11:16 /dev/nbd0p2
brw-rw---- 1 root disk 43,   5 2008-03-25 11:16 /dev/nbd0p5
# mount /dev/nbd0p1 /mnt
# ls /mnt
bin    dev   initrd      lost+found  opt   sbin     sys  var
boot   etc   initrd.img  media       proc  selinux  tmp  vmlinuz
cdrom  home  lib         mnt         root  srv      usr
# umount /mnt
# nbd-client -d /dev/nbd0
# ls -l /dev/nbd0*
brw-rw---- 1 root disk 43, 0 2008-03-25 11:16 /dev/nbd0
-------NOTE
On "nbd-client -d", we can do an ioctl(BLKRRPART) to update the partition table:
as the size of the device is 0, we don't have to serve the partition manager
request (-> no deadlock).
-------NOTE
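
For reference, a userspace program can request the same partition-table
re-read with the BLKRRPART ioctl.  The sketch below is a generic Linux
example, not code from nbd-client; the function name is hypothetical, and the
call normally needs root privileges on a real block device.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>      /* BLKRRPART */
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the kernel to re-read the partition table of a block device.
 * Returns 0 on success, -1 on failure (errno describes the error). */
static int reread_partition_table(const char *dev_path)
{
    int fd = open(dev_path, O_RDONLY);
    int rc, saved_errno;

    if (fd < 0)
        return -1;

    rc = ioctl(fd, BLKRRPART);
    saved_errno = errno;       /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;

    return rc < 0 ? -1 : 0;
}
```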

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:23 -07:00
Laurent Vivier 48cf6061b3 NBD: allow nbd to be used locally
This patch allows Network Block Device to be mounted locally (nbd-client to
nbd-server over 127.0.0.1).

It creates a kthread to avoid the deadlock described in NBD tools
documentation.  So, if nbd-client hangs waiting for pages, the kblockd thread
can continue its work and free pages.

I have tested the patch to verify that it avoids the hang that always occurs
when writing to a localhost nbd connection.  I have also tested to verify that
no performance degradation results from the additional thread and queue.

Patch originally from Laurent Vivier.

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:23 -07:00
Denis V. Lunev c7705f3449 drivers: use non-racy method for proc entries creation (2)
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
are set up before gluing the PDE to the main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Neil Brown <neilb@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:22 -07:00
Alexey Dobriyan 928b4d8c89 proc: remove proc_root_driver
Use creation by full path: "driver/foo".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:18 -07:00
Harvey Harrison afe42d7dea xen: make blkif_getgeo static
Introduced between 2.6.25-rc2 and -rc3
drivers/block/xen-blkfront.c:139:5: warning: symbol 'blkif_getgeo' was not declared. Should it be static?

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:06 -07:00
Jon Schindler 7afea3bcb1 drivers/block/floppy.c: replace init_module&cleanup_module with module_init&module_exit
Replace init_module and cleanup_module with static functions and
module_init/module_exit.

Signed-off-by: Jon Schindler <jkschind@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 08:06:03 -07:00
FUJITA Tomonori 4f54eec831 block: use blk_rq_init() to initialize the request
Any path needs to call it to initialize the request.

This is a preparation for large command support, which needs to
initialize the request in a proper way (that is, just doing a memset()
will not work).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-29 14:48:55 +02:00
FUJITA Tomonori 992b5bceee block: no need to initialize rq->cmd with blk_get_request
blk_get_request initializes rq->cmd (rq_init does) so the users don't
need to do that.

The purpose of this patch is to remove sizeof(rq->cmd) and &rq->cmd,
as a preparation for large command support, which changes rq->cmd from
the static array to a pointer. sizeof(rq->cmd) will not make sense and
&rq->cmd won't work.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-29 14:48:55 +02:00
FUJITA Tomonori 4917fa2925 block: no need to initialize rq->cmd in prepare_flush_fn hook
The block layer initializes rq->cmd (queue_flush calls rq_init) so
prepare_flush_fn hooks don't need to do that.

The purpose of this patch is to remove sizeof(rq->cmd), as a
preparation for large command support, which changes rq->cmd from the
static array to a pointer. sizeof(rq->cmd) will not make sense.
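
A simplified sketch of why sizeof(rq->cmd) stops being meaningful: the two
structs below are stand-ins for struct request before and after the planned
change, with hypothetical field names and a BLK_MAX_CDB value of 16 assumed
for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BLK_MAX_CDB 16   /* value assumed for this sketch */

/* Simplified stand-ins for struct request, before and after. */
struct request_old {
    unsigned char cmd[BLK_MAX_CDB];        /* static array */
};

struct request_new {
    unsigned char inline_cmd[BLK_MAX_CDB]; /* default storage (hypothetical name) */
    unsigned char *cmd;                    /* may point at a larger buffer */
};

static size_t old_cmd_size(void)
{
    struct request_old rq;

    return sizeof(rq.cmd);   /* size of the array */
}

static size_t new_cmd_size(void)
{
    struct request_new rq;

    return sizeof(rq.cmd);   /* size of a pointer, not of the buffer */
}
```

Once cmd is a pointer, callers need an explicit length such as BLK_MAX_CDB
instead of sizeof.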

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-29 14:48:54 +02:00
Nick Piggin 75ad23bc0f block: make queue flags non-atomic
We can save some atomic ops in the IO path, if we clearly define
the rules of how to modify the queue flags.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-29 14:48:33 +02:00
Geert Uytterhoeven 31e103c595 ps3disk: Remove superfluous cast
As ps3disk is a ppc64-only driver, sector_t is equivalent to unsigned long,
and the cast is not needed.

Reuse in another (possibly 32-bit) driver is protected by the safety net called
`compiler warning' (with the cast, it may silently truncate to 32-bit).
If sector_t ever changes, we will get a compiler warning as well (with the
cast, we won't).
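
The silent truncation can be seen in a small userspace sketch, where uint64_t
stands in for a 64-bit sector_t and uint32_t for a 32-bit unsigned long (both
are assumptions for illustration, and the function name is hypothetical).

```c
#include <assert.h>
#include <stdint.h>

/* With an explicit cast, narrowing is silent: the compiler treats
 * the truncation as intended and emits no diagnostic. */
static uint32_t narrow_with_cast(uint64_t sector)
{
    return (uint32_t)sector;
}
```

A sector number above 32 bits loses its upper half here without any warning,
which is the case the commit message wants the compiler to flag.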

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-29 09:50:34 +02:00
Jared Hulbert 30afcb4bd2 return pfn from direct_access, for XIP
Alter the block device ->direct_access() API to work with the new
get_xip_mem() API (that requires both kaddr and pfn are returned).

Some architectures will not do the right thing in their virt_to_page() for use
by XIP (to translate from the kernel virtual address returned by
direct_access() to a user mappable pfn in XIP's page fault handler).

However, we can't switch it to just return the pfn and not the kaddr, because
we have no good way to get a kva from a pfn, and XIP requires the kva for its
read(2) and write(2) handlers.  So we have to return both.

Signed-off-by: Jared Hulbert <jaredeh@gmail.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-mm@kvack.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28 08:58:23 -07:00
Mark McLoughlin 4f93f09b72 xen: Add compatibility aliases for frontend drivers
Before getting merged, xen-blkfront was xenblk and
xen-netfront was xennet.

Temporarily adding compatibility module aliases
eases upgrades from older versions by e.g. allowing
mkinitrd to find the new version of the module.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24 23:57:33 +02:00
Mark McLoughlin d2f0c52bec xen: Module autoprobing support for frontend drivers
Add module aliases to support autoprobing modules
for xen frontend devices.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24 23:57:33 +02:00
Christian Limpach 1d78d70556 xen blkfront: Delay wait for block devices until after the disk is added
When the xen block frontend driver is built as a module the module load
is only synchronous up to the point where the frontend and the backend
become connected rather than when the disk is added.

This means that there can be a race on boot between loading the module and
loading the dm-* modules and doing the scan for LVM physical volumes (all
in the initrd). In the failure case the disk is not present until after the
scan for physical volumes is complete.

Taken from:

  http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/11483a00c017

Signed-off-by: Christian Limpach <Christian.Limpach@xensource.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24 23:57:33 +02:00
Jeremy Fitzhardinge 53f0e8afcb xen/blkfront: use bdget_disk
info->dev is never initialized to anything, so bdget(info->dev) is
meaningless.  Get rid of info->dev, and use bdget_disk on the gendisk.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24 23:57:33 +02:00
Markus Armbruster 3e334239d8 xen: Make xen-blkfront write its protocol ABI to xenstore
Frontends are expected to write their protocol ABI to xenstore.  Since
the protocol ABI defaults to the backend's native ABI, things work
fine without that as long as the frontend's native ABI is identical to
the backend's native ABI.  This is not the case for xen-blkfront
running 32-on-64, because its ABI differs between 32 and 64 bit, and
thus needs this fix.

Based on http://xenbits.xensource.com/xen-unstable.hg?rev/c545932a18f3
and http://xenbits.xensource.com/xen-unstable.hg?rev/ffe52263b430 by
Gerd Hoffmann <kraxel@suse.de>

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24 23:57:32 +02:00
Petr Tesarik 26defe34e4 fix brd allocation flags
While looking at the implementation of the Ram backed block device
driver, I stumbled across a write-only local variable, which makes
little sense, so I assume it should actually work like this:

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-22 13:38:03 -07:00
Linus Torvalds 548453fd10 Merge branch 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block:
  block: fix blk_register_queue() return value
  block: fix memory hotplug and bouncing in block layer
  block: replace remaining __FUNCTION__ occurrences
  Kconfig: clean up block/Kconfig help descriptions
  cciss: fix warning oops on rmmod of driver
  cciss: Fix race between disk-adding code and interrupt handler
  block: move the padding adjustment to blk_rq_map_sg
  block: add bio_copy_user_iov support to blk_rq_map_user_iov
  block: convert bio_copy_user to bio_copy_user_iov
  loop: manage partitions in disk image
  cdrom: use kmalloced buffers instead of buffers on stack
  cdrom: make unregister_cdrom() return void
  cdrom: use list_head for cdrom_device_info list
  cdrom: protect cdrom_device_info list by mutex
  cdrom: cleanup hardcoded error-code
  cdrom: remove ifdef CONFIG_SYSCTL
2008-04-21 16:03:40 -07:00
Linus Torvalds 9a64388d83 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (202 commits)
  [POWERPC] Fix compile breakage for 64-bit UP configs
  [POWERPC] Define copy_siginfo_from_user32
  [POWERPC] Add compat handler for PTRACE_GETSIGINFO
  [POWERPC] i2c: Fix build breakage introduced by OF helpers
  [POWERPC] Optimize fls64() on 64-bit processors
  [POWERPC] irqtrace support for 64-bit powerpc
  [POWERPC] Stacktrace support for lockdep
  [POWERPC] Move stackframe definitions to common header
  [POWERPC] Fix device-tree locking vs. interrupts
  [POWERPC] Make pci_bus_to_host()'s struct pci_bus * argument const
  [POWERPC] Remove unused __max_memory variable
  [POWERPC] Simplify xics direct/lpar irq_host setup
  [POWERPC] Use pseries_setup_i8259_cascade() in pseries_mpic_init_IRQ()
  [POWERPC] Turn xics_setup_8259_cascade() into a generic pseries_setup_i8259_cascade()
  [POWERPC] Move xics_setup_8259_cascade() into platforms/pseries/setup.c
  [POWERPC] Use asm-generic/bitops/find.h in bitops.h
  [POWERPC] 83xx: mpc8315 - fix USB UTMI Host setup
  [POWERPC] 85xx: Fix the size of qe muram for MPC8568E
  [POWERPC] 86xx: mpc86xx_hpcn - Temporarily accept old dts node identifier.
  [POWERPC] 86xx: mark functions static, other minor cleanups
  ...
2008-04-21 15:50:49 -07:00
Harvey Harrison cece933994 block: replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__
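
A minimal example of the standard spelling: __func__ is defined by C99 inside
every function body and expands to the enclosing function's name.  The
function name whoami() is just for illustration.

```c
#include <assert.h>
#include <string.h>

/* __func__ is the C99 predefined identifier holding the name of
 * the enclosing function. */
static const char *whoami(void)
{
    return __func__;
}
```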

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21 09:51:04 +02:00
scameron@beardog.cca.cpqcorp.net 6195057f58 cciss: fix warning oops on rmmod of driver
* Fix oops on cciss rmmod due to calling pci_free_consistent with
  irqs disabled.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21 09:50:09 +02:00
scameron@beardog.cca.cpqcorp.net e14ac67026 cciss: Fix race between disk-adding code and interrupt handler
Fix race condition between cciss_init_one(), cciss_update_drive_info(),
and cciss_check_queues().

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21 09:50:09 +02:00
Laurent Vivier 476a4813cf loop: manage partitions in disk image
This patch allows the loop device to be used with a partitioned disk image.

Original behavior of loop is not modified.

A new parameter is introduced to define how many partitions we want to be
able to manage per loop device. This parameter is "max_part".

For instance, to manage 63 partitions / loop device, we will do:
# modprobe loop max_part=63
# ls -l /dev/loop?*
brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7

And to attach a raw partitioned disk image, the original losetup is used:

# losetup -f etch.img
# ls -l /dev/loop?*
brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
brw-rw---- 1 root disk 7,   1 2008-03-05 14:57 /dev/loop0p1
brw-rw---- 1 root disk 7,   2 2008-03-05 14:57 /dev/loop0p2
brw-rw---- 1 root disk 7,   5 2008-03-05 14:57 /dev/loop0p5
brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7
# mount /dev/loop0p1 /mnt
# ls /mnt
bench  cdrom  home        lib         mnt   root     srv  usr
bin    dev    initrd      lost+found  opt   sbin     sys  var
boot   etc    initrd.img  media       proc  selinux  tmp  vmlinuz
# umount /mnt
# losetup -d /dev/loop0

Of course, the same result can be achieved using kpartx on a loop device,
but modifying loop avoids stacking several layers of block devices (loop +
device mapper). This is a very light modification (40% of the changes
are there to manage the new parameter).

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-04-21 09:50:08 +02:00
Matthew Wilcox d3135846f6 drivers: Remove unnecessary inclusions of asm/semaphore.h
None of these files use any of the functionality promised by
asm/semaphore.h.  It's possible that they rely on it dragging in some
unrelated header file, but I can't build all these files, so we'll have
to fix any build failures as they come up.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-18 22:16:32 -04:00
David S. Miller 1e42198609 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-04-17 23:56:30 -07:00
Paul Mackerras ac7c5353b1 Merge branch 'linux-2.6' 2008-04-14 21:11:02 +10:00
Mike Pagano 231bc2a222 cciss: error: implicit declaration of function 'sg_init_table'
This patch adds the missing include directive <linux/scatterlist.h> to the
cciss.c source file.  This was discovered by our release team when building
the kernel for the Alpha architecture.

The build failed because, on Alpha, the functions 'sg_init_table' and
'sg_page' are not declared without this include.

Signed-off-by: Mike Pagano <mpagano@gentoo.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-11 08:06:44 -07:00
Pete Zaitcev ef45cb624b ub: remove BUG() after __blk_end_request and fix the condition causing it
When __blk_end_request returns nonzero, it means that the request was
not completely processed and some BIOs are still attached. Since we
have dequeued it by that time, it means leaking requests and hanging
processes, which is why BUG() was in there. In ub this happens if
a packet request ends normally, but with residue (e.g. when scsi_id
issues INQUIRY).

The fix is to make sure that arguments passed to __blk_end_request
are correct: the full request length and not just transferred length.
The transferred length is indicated to applications by adjusting
rq->data_len with old, unchanged code outside of this patch.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Cc: Greg KH <greg@kroah.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-08 18:25:52 -07:00
David S. Miller 3bb5da3837 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-04-03 14:33:42 -07:00
Mike Snitzer ffc41cf8db nbd: prevent sock_xmit from attempting to use a NULL socket
NBD does not protect the nbd_device's socket from becoming NULL during
receives.

This closes a race with the NBD_CLEAR_SOCK ioctl (nbd-client -d) setting
the nbd_device's socket to NULL right before NBD calls sock_xmit.

Signed-off-by: Mike Snitzer <snitzer@gmail.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-02 15:28:19 -07:00
Julia Lawall ea6728c11f [POWERPC] Use FIELD_SIZEOF in drivers/block/viodasd.c
Robert P.J. Day proposed using the macro FIELD_SIZEOF in place of code
that matches its definition.
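
As a small userspace illustration (the macro is re-created here from the
pattern this patch replaces, and struct sample is hypothetical), the
open-coded expression and the macro yield the same size:

```c
#include <assert.h>
#include <stddef.h>

/* Re-created for illustration from the open-coded pattern. */
#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))

struct sample {          /* hypothetical example type */
    char label[16];
    int flags;
};

/* Open-coded form: size of a member without needing an instance. */
static size_t label_size_open_coded(void)
{
    return sizeof(((struct sample *)0)->label);
}

/* Same computation through the macro. */
static size_t label_size_macro(void)
{
    return FIELD_SIZEOF(struct sample, label);
}
```

Because sizeof does not evaluate its operand, the null pointer is never
dereferenced.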

The modification was made using the following semantic patch
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@haskernel@
@@

#include <linux/kernel.h>

@depends on haskernel@
type t;
identifier f;
@@

- (sizeof(((t*)0)->f))
+ FIELD_SIZEOF(t, f)

@depends on haskernel@
type t;
identifier f;
@@

- sizeof(((t*)0)->f)
+ FIELD_SIZEOF(t, f)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-01 20:43:10 +11:00
YOSHIFUJI Hideaki c346dca108 [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.
Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:53 +09:00
Linus Torvalds 92f53c6f1e Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Revert "unexport bio_{,un}map_user"
  relay: fix subbuf_splice_actor() adding too many pages
  The ps2esdi driver was marked as BROKEN more than two years ago due to being
2008-03-18 07:43:14 -07:00
Jeremy Katz c483934670 virtio: Fix sysfs bits to have proper block symlink
Fix up so that the virtio_blk devices in sysfs link correctly to their
block device.  This then allows them to be detected by hal, etc.

Signed-off-by: Jeremy Katz <katzj@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-17 22:58:15 +11:00
Adrian Bunk 2af3e6017e The ps2esdi driver was marked as BROKEN more than two years ago due to being
no longer working for some time.

A driver that had been marked as BROKEN for such a long time seems to be
unlikely to be revived in the foreseeable future.

But if anyone wants to ever revive this driver, the code is still present in
the older kernel releases.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Alan Cox <alan@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-17 09:03:05 +01:00
Jiri Slaby f2005e1777 block: floppy: fix rmmod lockup
Floppy rmmod locks up when no such hardware was initialized, since there is
nobody to wake the remove code up.  Remove the completion, because release is
called during platform_unregister anyway.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-13 13:11:43 -07:00
Benjamin Herrenschmidt 25c0a7b832 [POWERPC] Fix viodasd driver with scatterlist debug
The iSeries viodasd driver does some very strange things with
scatterlists, one of these causing a BUG_ON to trigger when
scatterlist debugging is enabled due to initializing the
scatterlist with memset instead of sg_init_table().

This fixes it by using sg_init_table().  The rest of the stuff
it does to that poor list is still pretty awful but it will work.

I may look into fixing things in a nicer way some other time.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-03-13 10:09:28 +11:00
Peter Osterlund 05680d86d2 pktcdvd: reduce stack consumption
On my system, pkt_open() consumes 584 bytes because the compiler decides to
inline lots of functions that would not normally be part of long call chains.
The following patch fixes that problem on my system.

Signed-off-by: Peter Osterlund <petero2@telia.com>
Cc: Nix <nix@esperi.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 16:35:12 -08:00
Mike Miller 68d95b585f cciss: remove READ_AHEAD define and use block layer defaults
This patch removes the #define READ_AHEAD 1024 from the driver and uses the
block layer defaults, instead. We have found that under certain workloads
the setting can cause a disk connected to the e200 controller to go offline.
If the disk hiccups the link may try to downshift but the controller is
never notified that the link successfully completed the renegotiation.
We've also found that performance using the block layer default of 32 pages
was on par with the 1024 setting. We tried setting it to zero at one time
based on info from our firmware guys but that killed performance. Turns out
we were talking about 2 different read ahead settings.
Please consider this for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-04 11:28:43 +01:00
Mike Miller 89b6e74378 resubmit: cciss: procfs updates to display info about many
volumes

This patch allows us to display information about all of the logical volumes
configured on a particular controller without stepping on memory even when
there are many volumes (128 or more) configured.
Please consider this for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-03-04 11:14:39 +01:00
Paul Clements 48f15b93b2 NBD: make nbd default to deadline I/O scheduler
NBD doesn't work well with CFQ (or AS) schedulers, so let's default to
something else.

The two problems I have experienced with nbd and cfq are:

1) nbd hangs with cfq on RHEL 5 (2.6.18) -- this may well have been
   fixed

   There's a similar debian bug that has been filed as well:

   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=447638

   There have been posts to nbd-general mailing list about problems with
   cfq and nbd also.

2) nbd performs about 10% better (the last time I tested) with deadline
   vs.  cfq (the overhead of cfq doesn't provide much advantage to nbd [not
   being a real disk], and you end up going through the I/O scheduler on
   the nbd server anyway, so it makes sense that deadline is better with
   nbd)

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 17:12:15 -08:00
Ian Campbell 597592d951 xen: Implement getgeo for Xen virtual block device.
The below implements the getgeo hook for Xen block devices. Extracted
from the xen-unstable tree where it has been used for ages.

It is useful to have because it allows things like grub2 (used by the
Debian installer images) to work in a guest domain without having to
sprinkle Xen specific hacks around the place.

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-21 16:19:13 -08:00
Tony Breeds 2ebda63b09 Fix compile of swim3 as module
The current pmac32_defconfig fails to build with the following error:

  Building modules, stage 2.
ERROR: "check_media_bay" [drivers/block/swim3.ko] undefined!
WARNING: modpost: Found 23 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
make[2]: *** [__modpost] Error 1

This patch fixes that.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 20:58:04 -08:00
Pete Zaitcev 541645be8b ub: fix up the conversion to sg_init_table()
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: "Oliver Pinter" <oliver.pntr@gmail.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-09 11:08:33 -08:00
Linus Torvalds 03054de1e0 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Enhanced partition statistics: documentation update
  Enhanced partition statistics: remove old partition statistics
  Enhanced partition statistics: procfs
  Enhanced partition statistics: sysfs
  Enhanced partition statistics: aoe fix
  Enhanced partition statistics: update partition statistics
  Enhanced partition statistics: core statistics
  block: fixup rq_init() a bit

Manually fixed conflict in drivers/block/aoe/aoecmd.c due to statistics
support.
2008-02-08 09:42:46 -08:00
Paul Clements 20a8143eaa NBD: remove limit on max number of nbd devices
Remove the arbitrary 128 device limit for NBD.  nbds_max can now be set to
any number.  In certain scenarios where devices are used sparsely we have
run into the 128 device limit.

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:41 -08:00
Andrew Morton 476aed3870 aoe: statically initialise devlist_lock
I guess aoedev_init() can go away now.

Cc: Greg KH <greg@kroah.com>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00
Ed L. Cashin 52e112b3ab aoe: update copyright date
Update the year in the copyright notices.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00
Ed L. Cashin 578c4aa0b4 aoe: make error messages more specific
Andrew Morton pointed out that the "too many targets" message in patch 2 could
be printed for failing GFP_ATOMIC allocations.  This patch makes the messages
more specific.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00
Ed L. Cashin 1d75981a80 aoe: the aoeminor doesn't need a long format
The aoedev aoeminor member doesn't need a long format.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00
Ed L. Cashin 7df620d852 aoe: add module parameter for users who need more outstanding I/O
An AoE target provides an estimate of the number of outstanding commands that
the AoE initiator can send before getting a response.  The aoe_maxout
parameter provides a way to set an even lower limit.  It will not allow a user
to use more outstanding commands than the target permits.  If a user discovers
a problem with a large setting, this parameter provides a way for us to work
with them to debug the problem.  We expect to improve the dynamic window
sizing algorithm and drop this parameter.  For the time being, it is a
debugging aid.
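
The relationship between the two limits can be sketched as below.  The
function name and its arguments are hypothetical, chosen for illustration;
only the rule itself (the parameter can lower the limit but never raise it
above what the target permits) comes from the description above.

```c
/*
 * Hypothetical helper: the target's estimate is the upper bound and the
 * aoe_maxout module parameter can only lower it.
 */
static unsigned int effective_maxout(unsigned int target_estimate,
				     unsigned int aoe_maxout)
{
	return aoe_maxout < target_estimate ? aoe_maxout : target_estimate;
}
```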

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00
Ed L. Cashin 6b9699bbd2 aoe: only install new AoE device once
An aoe driver user who had about 70 AoE targets found that he was hitting a
BUG in sysfs_create_file because the aoe driver was trying to tell the kernel
about an AoE device more than once.  Each AoE device was reachable by several
local network interfaces, and multiple ATA device identify responses were
returning from that single device.

This patch eliminates a race condition so that aoe always informs the block
layer of a new AoE device once in the presence of multiple incoming ATA device
identify responses.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00
Ed L. Cashin 9bb237b6a6 aoe: dynamically allocate a capped number of skbs when necessary
What this Patch Does

  Even before this recent series of 12 patches to 2.6.22-rc4, the aoe
  driver was reusing a small set of skbs that were allocated once and
  were only used for outbound AoE commands.

  The network layer cannot be allowed to put_page on the data that is
  still associated with a bio we haven't returned to the block layer,
  so the aoe driver (even before the patch under discussion) is still
  the owner of skbs that have been handed to the network layer for
  transmission.  We need to keep track of these skbs so that we can
  free them, but by tracking them, we can also easily re-use them.

  The new patch was a response to the behavior of certain network
  drivers.  We cannot reuse an skb that the network driver still has
  in its transmit ring.  Network drivers can defer transmit ring
  cleanup and then use the state in the skb to determine how many data
  segments to clean up in its transmit ring.  The tg3 driver is one
  driver that behaves in this way.

  When the network driver defers cleanup of its transmit ring, the aoe
  driver can find itself in a situation where it would like to send an
  AoE command, and the AoE target is ready for more work, but the
  network driver still has all of the pre-allocated skbs.  In that
  case, the new patch just calls alloc_skb, as you'd expect.

  We don't want to get carried away, though.  We try not to do
  excessive allocation in the write path, so we cap the number of skbs
  we dynamically allocate.

  Probably calling it a "dynamic pool" is misleading.  We were already
  trying to use a small fixed-size set of pre-allocated skbs before
  this patch, and this patch just provides a little headroom (with a
  ceiling, though) to accommodate network drivers that hang onto skbs,
  by allocating when needed.  The d->skbpool_hd list of allocated skbs
  is necessary so that we can free them later.

  We didn't notice the need for this headroom until AoE targets got
  fast enough.
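
  The idea can be modeled in userspace roughly as follows.  This is a sketch
  under stated assumptions, not the driver's code: struct buf stands in for
  an skb, the in_flight flag stands in for "the network driver still holds
  it", and NPREALLOC and NDYNAMIC_MAX are made-up sizes.  The tracking list
  plays the role that d->skbpool_hd plays in the description above.

```c
#include <stdlib.h>

#define NPREALLOC	2	/* hypothetical size of the pre-allocated set */
#define NDYNAMIC_MAX	2	/* hypothetical cap on extra allocations */

struct buf {
	int in_flight;		/* still held by the "network driver" */
	struct buf *next;	/* tracking list, so buffers can be freed */
};

struct pool {
	struct buf *head;	/* every buffer ever allocated */
	int count;
};

/* Allocate one more buffer and link it into the tracking list. */
static struct buf *pool_grow(struct pool *p)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->next = p->head;
	p->head = b;
	p->count++;
	return b;
}

/* Set up the small fixed-size set of pre-allocated buffers. */
static int pool_init(struct pool *p)
{
	p->head = NULL;
	p->count = 0;
	while (p->count < NPREALLOC)
		if (!pool_grow(p))
			return -1;
	return 0;
}

/* Reuse a free buffer if there is one; otherwise allocate up to the cap. */
static struct buf *pool_get(struct pool *p)
{
	struct buf *b;

	for (b = p->head; b; b = b->next)
		if (!b->in_flight)
			break;
	if (!b && p->count < NPREALLOC + NDYNAMIC_MAX)
		b = pool_grow(p);
	if (b)
		b->in_flight = 1;
	return b;		/* NULL means: at the ceiling, try later */
}

/* Called once the "network driver" is done with the buffer. */
static void pool_put(struct buf *b)
{
	b->in_flight = 0;
}

/* Free everything on the tracking list. */
static void pool_free(struct pool *p)
{
	while (p->head) {
		struct buf *b = p->head;

		p->head = b->next;
		free(b);
	}
	p->count = 0;
}
```

  With these sizes, the fifth pool_get() in a row returns NULL, and a buffer
  handed back with pool_put() is reused before anything new is allocated.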

Alternatives

  If the network layer never did a put_page on the pages in the bios
  we get from the block layer, then it would be possible for us to
  hand skbs to the network layer and forget about them, allowing the
  network layer to free skbs itself (and thereby calling our own
  skb->destructor callback function if we needed that).  In that case
  we could get rid of the pre-allocated skbs and also the
  d->skbpool_hd, instead just calling alloc_skb every time we wanted
  to transmit a packet.  The slab allocator would effectively maintain
  the list of skbs.

  Besides a loss of CPU cache locality, the main concern with that
  approach is the danger that it would increase the likelihood of
  deadlock when VM is trying to free pages by writing dirty data from
  the page cache through the aoe driver out to persistent storage on
  an AoE device.  Right now we have a situation where we have
  pre-allocation that corresponds to how much we use, which seems
  ideal.

  Of course, there's still the separate issue of receiving the packets
  that tell us that a write has successfully completed on the AoE
  target.  When memory is low and VM is using AoE to flush dirty data
  to free up pages, it would be perfect if there were a way for us to
  register a fast callback that could recognize write command
  completion responses.  But I don't think the current problems with
  the receive side of the situation are a justification for
  exacerbating the problem on the transmit side.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:32 -08:00
Ed L. Cashin 262bf54144 aoe: user can ask driver to forget previously detected devices
When an AoE device is detected, the kernel is informed, and a new block device
is created.  If the device is unused, the block device corresponding to a
remote device that is no longer available may be removed from the system by telling
the aoe driver to "flush" its list of devices.

Without this patch, software like GPFS and LVM may attempt to read from AoE
devices that were discovered earlier but are no longer present, blocking until
the I/O attempt times out.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Ed L. Cashin cf446f0dba aoe: eliminate goto and improve readability
Adam Richter suggested eliminating this goto.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Ed L. Cashin 1eb0da4cea aoe: mac_addr: avoid 64-bit arch compiler warnings
By returning unsigned long long, mac_addr does not generate compiler warnings
on 64-bit architectures.
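
A sketch of the idea, not the driver's implementation: the helper names
below are hypothetical.  Because the return type is unsigned long long on
every architecture, the "%llx" conversion always matches its argument, which
would not be the case for a 64-bit typedef that is plain unsigned long on
some 64-bit targets.

```c
#include <stdio.h>

/* Pack a 6-byte MAC address into an integer, most significant byte first. */
static unsigned long long mac_to_ull(const unsigned char addr[6])
{
	unsigned long long n = 0;
	int i;

	for (i = 0; i < 6; i++)
		n = (n << 8) | addr[i];
	return n;
}

/* Format as 12 hex digits; buf must hold at least 13 bytes. */
static void mac_format(const unsigned char addr[6], char *buf, size_t len)
{
	snprintf(buf, len, "%012llx", mac_to_ull(addr));
}
```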

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Ed L. Cashin 68e0d42f39 aoe: handle multiple network paths to AoE device
A remote AoE device is something that can process ATA commands and is
identified by an AoE shelf number and an AoE slot number.  Such a device might
have more than one network interface, and it might be reachable by more than
one local network interface.  This patch tracks the network paths available to
each AoE device, allowing them to be used more efficiently.

Andrew Morton asked about the call to msleep_interruptible in the revalidate
function.  Yes, if a signal is pending, then msleep_interruptible will not
return 0.  That means we will not loop but will call aoenet_xmit with a NULL
skb, which is a noop.  If the system is too low on memory or the aoe driver is
too low on frames, then the user can hit control-C to interrupt the attempt to
do a revalidate.  I have added a comment to the code summarizing that.

Andrew Morton asked whether the allocation performed inside addtgt could use a
more relaxed allocation like GFP_KERNEL, but addtgt is called when the aoedev
lock has been locked with spin_lock_irqsave.  It would be nice to allocate the
memory under fewer restrictions, but targets are only added when the device is
being discovered, and if the target can't be added right now, we can try again
in a minute when the next AoE config query broadcast goes out.

Andrew Morton pointed out that the "too many targets" message could be printed
for failing GFP_ATOMIC allocations.  The last patch in this series makes the
messages more specific.

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Ed L. Cashin 8911ef4dc9 aoe: bring driver version number to 47
Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:31 -08:00
Nick Piggin 75acb9cd2e rd: support XIP
Support direct_access XIP method with brd.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:30 -08:00
Nick Piggin 9db5579be4 rewrite rd
This is a rewrite of the ramdisk block device driver.

The old one is really difficult because it effectively implements a block
device which serves data out of its own buffer cache.  It relies on the dirty
bit being set, to pin its backing store in cache, however there are non
trivial paths which can clear the dirty bit (eg.  try_to_free_buffers()),
which had recently led to data corruption.  And in general it is completely
wrong for a block device driver to do this.

The new one is more like a regular block device driver.  It has no idea about
vm/vfs stuff.  Its backing store is similar to the buffer cache (a simple
radix-tree of pages), but it doesn't know anything about page cache (the pages
in the radix tree are not pagecache pages).

There is one slight downside -- direct block device access and filesystem
metadata access go through an extra copy and get stored in RAM twice.
However, this downside is only slight, because the real buffercache of the
device is now reclaimable (because we're not playing crazy games with it), so
under memory intensive situations, footprint should effectively be the same --
maybe even a slight advantage to the new driver because it can also reclaim
buffer heads.

The fact that it now goes through all the regular vm/fs paths makes it
much more useful for testing, too.

   text    data     bss     dec     hex filename
   2837     849     384    4070     fe6 drivers/block/rd.o
   3528     371      12    3911     f47 drivers/block/brd.o

Text is larger, but data and bss are smaller, making total size smaller.

A few other nice things about it:
- Similar structure and layout to the new loop device handling.
- Dynamic ramdisk creation.
- Runtime flexible buffer head size (because it is no longer part of the
  ramdisk code).
- Boot / load time flexible ramdisk size, which could easily be extended
  to a per-ramdisk runtime changeable size (eg. with an ioctl).
- Can use highmem for the backing store.

[akpm@linux-foundation.org: fix build]
[byron.bbradley@gmail.com: make rd_size non-static]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Byron Bradley <byron.bbradley@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:30 -08:00
Jerome Marchand a890d62b9e Enhanced partition statistics: aoe fix
Updates the enhanced partition statistics in ATA over Ethernet driver
(not tested).

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
2008-02-08 12:41:57 +01:00
Linus Torvalds 3796958130 Merge branch 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (69 commits)
  [POWERPC] Add SPE registers to core dumps
  [POWERPC] Use regset code for compat PTRACE_*REGS* calls
  [POWERPC] Use generic compat_sys_ptrace
  [POWERPC] Use generic compat_ptrace_request
  [POWERPC] Use generic ptrace peekdata/pokedata
  [POWERPC] Use regset code for PTRACE_*REGS* requests
  [POWERPC] Switch to generic compat_binfmt_elf code
  [POWERPC] Switch to using user_regset-based core dumps
  [POWERPC] Add user_regset compat support
  [POWERPC] Add user_regset_view definitions
  [POWERPC] Use user_regset accessors for GPRs
  [POWERPC] ptrace accessors for special regs MSR and TRAP
  [POWERPC] Use user_regset accessors for SPE regs
  [POWERPC] Use user_regset accessors for altivec regs
  [POWERPC] Use user_regset accessors for FP regs
  [POWERPC] mpc52xx: fix compile error introduced when rebasing patch
  [POWERPC] 4xx: PCIe indirect DCR spinlock fix.
  [POWERPC] Add missing native dcr dcr_ind_lock spinlock
  [POWERPC] 4xx: Fix offset value on Warp board
  [POWERPC] 4xx: Add 440EPx Sequoia ehci dts entry
  ...
2008-02-07 09:02:26 -08:00
Josh Boyer 256ae6a720 Merge branch 'virtex-for-2.6.25' of git://git.secretlab.ca/git/linux-2.6-virtex into for-2.6.25
2008-02-06 21:06:45 -06:00
Geert Uytterhoeven 5ceadd2a2a Atari floppy: Rename disk_type to atari_disk_type
Commit edfaa7c365

    Driver core: convert block from raw kobjects to core devices

    This moves the block devices to /sys/class/block. It will create a
    flat list of all block devices, with the disks and partitions in one
    directory. For compatibility /sys/block is created and contains symlinks
    to the disks.

introduced a global disk_type variable in <linux/genhd.h>, causing the
following compile error on Atari:

    drivers/block/ataflop.c:93: error: conflicting types for 'disk_type'
    include/linux/genhd.h:21: error: previous declaration of 'disk_type' was here

Rename the local disk_type variable in drivers/block/ataflop.c to
atari_disk_type, to avoid the conflict.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:10 -08:00
Randy Dunlap 582539e5a0 cciss: use upper_32_bits() macro to eliminate warnings
Use upper_32_bits(x) macro to handle shifts that may be >= the width of
the data type.

drivers/block/cciss.c: In function 'do_cciss_request':
drivers/block/cciss.c:2655: warning: right shift count >= width of type
drivers/block/cciss.c:2656: warning: right shift count >= width of type
drivers/block/cciss.c:2657: warning: right shift count >= width of type
drivers/block/cciss.c:2658: warning: right shift count >= width of type
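
To illustrate why the macro silences these warnings, here is a small
userspace sketch.  The macro body follows the kernel's form, with uint32_t
standing in for u32; the two wrapper functions are hypothetical.  Shifting
by 16 twice stays well defined when the operand is only 32 bits wide,
whereas a single ">> 32" on such an operand is what the compiler complains
about.

```c
#include <stdint.h>

/* Two 16-bit shifts instead of one 32-bit shift. */
#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))

/* 64-bit operand: yields the high word. */
static uint32_t hi_of_64(uint64_t addr)
{
	return upper_32_bits(addr);
}

/* 32-bit operand: yields 0, with no "shift count >= width" warning. */
static uint32_t hi_of_32(uint32_t addr)
{
	return upper_32_bits(addr);
}
```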

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:03 -08:00
Robert P. J. Day 67a3b2b6ce rd: use is_power_of_2() in drivers/block/rd.c.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:03 -08:00
David Woodhouse 96c5865559 Allow auto-destruction of loop devices
This allows a flag to be set on loop devices so that when they are
closed for the last time, they'll self-destruct.

In general, so that we can automatically allocate loop devices (as with
losetup -f) and have them disappear when we're done with them.

In particular, right now, so that we can stop relying on the hackish
special-case in umount(8) which kills off loop devices which were set up by
'mount -oloop'.  That means we can stop putting crap in /etc/mtab which
doesn't belong there, which means it can be a symlink to /proc/mounts, which
means yet another writable file on the root filesystem is eliminated and the
'stateless' folks get happier...  and OLPC trac #356 can be closed.

The mount(8) side of that is at
http://marc.info/?l=util-linux-ng&m=119362955431694&w=2

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Bernardo Innocenti <bernie@codewiz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:41:01 -08:00
Alexey Dobriyan eaa0ff15c3 fix ! versus & precedence in various places
Fix various instances of

	if (!expr & mask)

which should probably have been

	if (!(expr & mask))
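
A small sketch of the difference.  The function names are made up for
illustration, and the buggy variant is written with explicit parentheses
that show how the compiler parses "!expr & mask", so that it builds without
a warning.

```c
/* How "!flags & mask" actually parses: "!" binds tighter than "&". */
static int bits_clear_buggy(unsigned int flags, unsigned int mask)
{
	return (!flags) & mask;
}

/* Intended test: true when none of the mask bits are set in flags. */
static int bits_clear_fixed(unsigned int flags, unsigned int mask)
{
	return !(flags & mask);
}
```

With flags = 0x4 and mask = 0x2 the masked bit is clear, so the fixed form
yields 1, while the buggy form computes 0 & 0x2 and yields 0.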

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06 10:40:59 -08:00
Stephen Neuendorffer 0e349b0e2d [POWERPC] Xilinx: Update compatible to use values generated by BSP generator.
Mainly, this involves two changes:
1) xilinx->xlnx (recognized standard is to use the stock ticker)
2) In order to have the device tree focus on describing what the
hardware is as exactly as possible, the compatible strings contain the
full IP name and IP version.

Signed-off-by: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-02-06 10:23:21 -07:00
Grant Likely 911a317599 [POWERPC] Fix incorrectly tagged __devinitdata structures
Fix compile errors in the xilinxfb, xsysace and uartlite drivers used
by the Xilinx Virtex platform

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
2008-02-06 10:23:12 -07:00
Linus Torvalds 93890b71a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (25 commits)
  virtio: balloon driver
  virtio: Use PCI revision field to indicate virtio PCI ABI version
  virtio: PCI device
  virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzz
  virtio_blk: Dont waste major numbers
  virtio_blk: provide getgeo
  virtio_net: parametrize the napi_weight for virtio receive queue.
  virtio: free transmit skbs when notified, not on next xmit.
  virtio: flush buffers on open
  virtnet: remove double ether_setup
  virtio: Allow virtio to be modular and used by modules
  virtio: Use the sg_phys convenience function.
  virtio: Put the virtio under the virtualization menu
  virtio: handle interrupts after callbacks turned off
  virtio: reset function
  virtio: populate network rings in the probe routine, not open
  virtio: Tweak virtio_net defines
  virtio: Net header needs hdr_len
  virtio: remove unused id field from struct virtio_blk_outhdr
  virtio: clarify NO_NOTIFY flag usage
  ...
2008-02-04 08:00:54 -08:00
Christian Borntraeger d50ed907dc virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzz
On Friday, 1 February 2008, Christian Borntraeger wrote:
> Right. I will fix that with an additional patch.

This patch goes on top of the minor number patch. Please let me know if
you want a merged patch:

Currently virtio_blk creates the disk name by combining "vd" with 'a'++.
This will give strange names after vdz. I have implemented names up to
vdzzz - inspired by the sd.c code. That should be sufficient for now.
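
One way to generate such names is sketched below.  This is not the code
from the patch; vd_name() is a hypothetical helper that uses the usual
"bijective base 26" trick, so index 0 gives "vda", 25 gives "vdz", 26 gives
"vdaa", and anything past "vdzzz" is rejected.

```c
#include <stdio.h>

/* Hypothetical helper: write "vd" plus a one- to three-letter suffix. */
static int vd_name(unsigned int index, char *buf, size_t len)
{
	char suffix[4];		/* up to three letters plus NUL */
	int pos = 3;
	long i = index;

	suffix[pos] = '\0';
	do {
		if (pos == 0)
			return -1;	/* beyond "vdzzz" */
		suffix[--pos] = (char)('a' + i % 26);
		i = i / 26 - 1;		/* "- 1" because there is no zero digit */
	} while (i >= 0);

	snprintf(buf, len, "vd%s", suffix + pos);
	return 0;
}
```

The suffix is built from the last letter backwards, which is why the
function fills a scratch buffer first and copies it out at the end.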

There is one driver in the kernel (driver/s390/block/dasd_genhd.c) that
implements names from dasda-dasdzzzz allowing even more disks. Maybe
a janitor can come up with a common implementation usable for all kinds
of block device drivers.

I have tested this patch with 100 disks - seems to work.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:50:11 +11:00
Christian Borntraeger 4f3bf19c6e virtio_blk: Dont waste major numbers
Rusty,

currently virtio_blk uses one major number per device. While this works
quite well on most systems it is wasteful and will exhaust major numbers
on larger installations.

This patch allocates a major number on init and will use 16 minor numbers
for each disk. That will allow ~64k virtio_blk disks.
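
The "~64k" figure follows from the width of the minor number.  A quick
sketch of the arithmetic, assuming the kernel's 20-bit minor field
(MINORBITS); the helper names are hypothetical:

```c
#define MINORBITS	20	/* width of the minor field in the kernel's dev_t */
#define MINORS_PER_DISK	16	/* from the description above */

/* First minor number used by the disk with the given index. */
static unsigned int index_to_minor(unsigned int index)
{
	return index * MINORS_PER_DISK;
}

/* Number of disks that fit under a single major number. */
static unsigned int max_disks(void)
{
	return (1u << MINORBITS) / MINORS_PER_DISK;
}
```

2^20 minors divided by 16 per disk gives 65536 disks.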

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:50:10 +11:00
Christian Borntraeger 135da0b037 virtio_blk: provide getgeo
Rusty,

I am currently trying to make my guest boot from a virtio root device
without having an external kernel. Some of the tools that I tried
expect HDIO_GETGEO to work. The most interesting value is likely
the geo.start value to get the offset of a partition. This value
is filled by block/ioctl.c if fops->getgeo is set. This patch also
fills in some standard values for heads, sectors and cylinders.
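
For illustration only, a made-up geometry calculation of this kind might
look as follows.  The message above does not say which values the patch
uses, so the 64 heads and 32 sectors per track here are assumptions, and
struct geo_sketch is a simplified stand-in for the kernel's
struct hd_geometry.

```c
/* Simplified stand-in for struct hd_geometry. */
struct geo_sketch {
	unsigned char heads;
	unsigned char sectors;
	unsigned short cylinders;
	unsigned long start;	/* left alone here; the caller fills it in */
};

/* Derive a fake CHS geometry from a capacity given in 512-byte sectors. */
static void sketch_getgeo(unsigned long long capacity, struct geo_sketch *geo)
{
	unsigned long long cyl = capacity / (64ULL * 32ULL);

	geo->heads = 64;	/* assumed value */
	geo->sectors = 32;	/* assumed value */
	geo->cylinders = cyl > 0xffff ? 0xffff : (unsigned short)cyl;
}
```

Clamping the cylinder count at 0xffff is a choice made for this sketch, so
that very large disks do not wrap around in the 16-bit field.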

Makes sense?

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:50:09 +11:00
Anthony Liguori 0ad07ec1fd virtio: Put the virtio under the virtualization menu
This patch moves virtio under the virtualization menu and changes virtio
devices to not claim to only be for lguest.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:50:05 +11:00
Rusty Russell 6e5aa7efb2 virtio: reset function
A reset function solves three problems:

1) It allows us to renegotiate features, eg. if we want to upgrade a
   guest driver without rebooting the guest.

2) It gives us a clean way of shutting down virtqueues: after a reset,
   we know that the buffers won't be used by the host, and

3) It helps the guest recover from messed-up drivers.

So we remove the ->shutdown hook, and the only way we now remove
feature bits is via reset.

We leave it to the driver to do the reset before it deletes queues:
the balloon driver, for example, needs to chat to the host in its
remove function.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:50:03 +11:00
Rusty Russell 18445c4d50 virtio: explicit enable_cb/disable_cb rather than callback return.
It seems that virtio_net wants to disable callbacks (interrupts) before
calling netif_rx_schedule(), so we can't use the return value to do so.

Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback
now returns void, rather than a boolean.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:49:58 +11:00
Rusty Russell a586d4f601 virtio: simplify config mechanism.
Previously we used a type/len pair within the config space, but this
seems overkill.  We now simply define a structure which represents the
layout in the config space: the config space can now only be extended
at the end.

The main driver-visible changes:
1) We indicate what fields are present with an explicit feature bit.
2) Virtqueues are explicitly numbered, and not in the config space.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:49:57 +11:00
Joe Perches f66083c376 drivers/block/: Spelling fixes
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:09:38 +02:00
Pete Zaitcev eedffd12e0 USB: Remove unnecessary zeroing from ub
These zeroings were taken from usb-storage a long time ago. I examined
the submission paths and usb_fill_bulk_urb and found them unnecessary.

Signed-off-by: Pete Zaitcev <zaitcev@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-01 14:34:47 -08:00
Adrian Bunk e30f98fcac block/sunvdc.c:print_version() must be __devinit
This patch fixes the following section mismatches:

<--  snip  -->

...
WARNING: drivers/block/sunvdc.o(.text+0xf0): Section mismatch in reference from the function print_version() to the variable .devinit.data:version
WARNING: drivers/block/sunvdc.o(.text+0xf8): Section mismatch in reference from the function print_version() to the variable .devinit.data:version
...

<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-01 09:26:32 +01:00
Jens Axboe e7d9dc9cfd cciss: fix bug in overriding ->data_len before completion
For BLOCK_PC requests, we need that length for completing the request.
Andrew Vasquez <andrew.vasquez@qlogic.com> reported the following
oops

Hitting a consistent BUG() with recent Linus' linux-2.6.git:

	[   12.941428] ------------[ cut here ]------------
	[   12.944874] kernel BUG at drivers/block/cciss.c:1260!
	[   12.944874] invalid opcode: 0000 [1] SMP
	[   12.944874] CPU 0
	[   12.944874] Modules linked in:
	[   12.944874] Pid: 0, comm: swapper Not tainted 2.6.24 #43
	[   12.944874] RIP: 0010:[<ffffffff8039e43d>]  [<ffffffff8039e43d>] cciss_softirq_done+0xbc/0x1bf
	[   12.944874] RSP: 0018:ffffffff8063aed0  EFLAGS: 00010202
	[   12.944874] RAX: 0000000000000001 RBX: ffff8100cf800010 RCX: ffff81042f1253b0
	[   12.944874] RDX: ffff81042de398f0 RSI: ffff81042de398f0 RDI: 0000000000000001
	[   12.944874] RBP: ffff81042daa0000 R08: ffff81042f1253b0 R09: 0000000000000001
	[   12.944874] R10: 00000000000000fe R11: 0000000000000000 R12: 0000000000000002
	[   12.944874] R13: 0000000000000001 R14: ffff8100cf800000 R15: ffff81042de398f0
	[   12.944874] FS:  0000000000000000(0000) GS:ffffffff805bb000(0000) knlGS:0000000000000000
	[   12.944874] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
	[   12.944874] CR2: 00002afed7eea340 CR3: 000000042dbba000 CR4: 00000000000006e0
	[   12.944874] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	[   12.944874] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
	[   12.944874] Process swapper (pid: 0, threadinfo ffffffff805f4000, task ffffffff805624a0)
	[   12.944874] Stack:  0000000000000000 ffffffff8063af10 0000000000000001 ffffffff80632d60
	[   12.944874]  0000000000000000 000000000000000a ffffffff805bb900 ffffffff8032038f
	[   12.944874]  ffffffff8063af10 ffffffff8063af10 ffffffff805bb940 ffffffff802346b4
	[   12.944874] Call Trace:
	[   12.944874]  <IRQ>  [<ffffffff8032038f>] blk_done_softirq+0x69/0x78
	[   12.944874]  [<ffffffff802346b4>] __do_softirq+0x6f/0xd8
	[   12.944874]  [<ffffffff8020c45c>] call_softirq+0x1c/0x30
	[   12.944874]  [<ffffffff8020e347>] do_softirq+0x30/0x80
	[   12.944874]  [<ffffffff8020e409>] do_IRQ+0x72/0xd9
	[   12.944874]  [<ffffffff8020a50a>] mwait_idle+0x0/0x46
	[   12.944874]  [<ffffffff8020a3da>] default_idle+0x0/0x3d
	[   12.944874]  [<ffffffff8020b7e1>] ret_from_intr+0x0/0xa
	[   12.944874]  <EOI>  [<ffffffff8020a54c>] mwait_idle+0x42/0x46
	[   12.944874]  [<ffffffff8020a481>] cpu_idle+0x6a/0xae
	[   12.944874]
	[   12.944874]
	[   12.944874] Code: 0f 0b eb fe 48 8d 85 d8 c0 00 00 48 89 04 24 48 89 c7 e8 e5
	[   12.944874] RIP  [<ffffffff8039e43d>] cciss_softirq_done+0xbc/0x1bf
	[   12.944874]  RSP <ffffffff8063aed0>
	[   12.944903] ---[ end trace e9c631603f90d22f ]---

which is caused by blk_end_request() returning 'not done' for a request,
since it gets asked to complete zero bytes.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-29 21:55:18 +01:00
Jens Axboe 9bf722598f xsysace: end request handling fix
In ace_fsm_dostate(), the variable 'i' was used only for passing
sector size of the request to end_that_request_first().
So I removed it and changed the code to pass the size in bytes
directly to __blk_end_request()

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-29 21:54:53 +01:00
Linus Torvalds e189f3495c Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (197 commits)
  sh: add spi header and r2d platform data V3
  sh: update r7780rp interrupt code
  sh: remove consistent alloc stuff from the machine vector
  sh: use declared coherent memory for dreamcast pci ethernet adapter
  sh: declared coherent memory support V2
  sh: Add support for SDK7780 board.
  sh: constify function pointer tables
  sh: Kill off -traditional for linker script.
  cdrom: Add support for Sega Dreamcast GD-ROM.
  sh: Kill off hs7751rvoip reference from arch/sh/Kconfig.
  sh: Drop r7780rp_defconfig, use r7780mp_defconfig as kbuild default.
  sh: Kill off dead HS771RVoIP board support.
  sh: r7785rp: Fix up DECLARE_INTC_DESC() arg mismatch.
  sh: r7785rp: Hook up the rest of the HL7785 FPGA IRQ vectors.
  sh: r2d - enable sm501 usb host function
  sh: remove voyagergx
  sh: r2d - add lcd planel timings to sm501 platform data
  sh: Add OHCI and UDC platform devices for SH7720.
  sh: intc - remove default interrupt priority tables
  sh: Correct pte size mismatch for X2 TLB.
  ...
2008-01-29 08:52:50 +11:00
Kiyoshi Ueda a65b58663d blk_end_request: changing xsysace (take 4)
This patch converts xsysace to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

xsysace is a little bit different from "normal" drivers.
xsysace driver has a state machine in it.
It calls end_that_request_first() and end_that_request_last()
from different states. (ACE_FSM_STATE_REQ_TRANSFER and
ACE_FSM_STATE_REQ_COMPLETE, respectively.)

However, those states are consecutive and without any interruption
in between.
So we can just follow the standard conversion rule (b) mentioned in
the patch subject "[PATCH 01/30] blk_end_request: add new request
completion interface".

Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:20 +01:00
Kiyoshi Ueda 7d699bafe2 blk_end_request: changing ub (take 4)
This patch converts ub to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

Cc: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:17 +01:00
Kiyoshi Ueda ea6f06f416 blk_end_request: changing cpqarray (take 4)
This patch converts cpqarray to use blk_end_request interfaces.
Related 'ok' arguments are converted to 'error'.

cpqarray is a little bit different from "normal" drivers.
cpqarray directly calls bio_endio() and disk_stat_add()
when completing a request.  But those can be replaced with
__end_that_request_first().
After the replacement, request completion procedures of
those drivers become like the following:
    o end_that_request_first()
    o add_disk_randomness()
    o end_that_request_last()
This can be converted to __blk_end_request() by following
the rule (b) mentioned in the patch subject
"[PATCH 01/30] blk_end_request: add new request completion interface".

Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:00 +01:00
Kiyoshi Ueda 3daeea29f9 blk_end_request: changing cciss (take 4)
This patch converts cciss to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

cciss is a little bit different from "normal" drivers.
cciss directly calls bio_endio() and disk_stat_add()
when completing a request.  But those can be replaced with
__end_that_request_first().
After the replacement, request completion procedures of
those drivers become like the following:
    o end_that_request_first()
    o add_disk_randomness()
    o end_that_request_last()
This can be converted to blk_end_request() by following
the rule (a) mentioned in the patch subject
"[PATCH 01/30] blk_end_request: add new request completion interface".

Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:58 +01:00
Kiyoshi Ueda f530f03637 blk_end_request: changing xen-blkfront (take 4)
This patch converts xen-blkfront to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:46 +01:00
Kiyoshi Ueda b2aec24ea4 blk_end_request: changing viodasd (take 4)
This patch converts viodasd to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

As a result, the interface of internal function, viodasd_end_request(),
is changed.

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:44 +01:00
Kiyoshi Ueda a9c73d05f1 blk_end_request: changing sx8 (take 4)
This patch converts sx8 to use blk_end_request interfaces.
Related 'uptodate' and 'is_ok' arguments are converted to 'error'.

As a result, the interfaces of internal functions below are changed.
  o carm_end_request_queued
  o carm_end_rq
  o carm_handle_array_info
  o carm_handle_scan_chan
  o carm_handle_generic
  o carm_handle_rw

The 'is_ok' is set at only one place in carm_handle_resp() below:

	int is_ok = (status == RMSG_OK);

The value is propagated to all the functions above and is not modified
anywhere else.
So the actual conversion of the 'is_ok' is done at only one place above.
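
A rough sketch of what that single conversion amounts to, assuming the old
convention (positive 'uptodate' means success, zero means a generic I/O
error, negative is an error code) and the new one (zero means success,
negative is an error code). RMSG_OK_SKETCH and the function names are
hypothetical; this is not the sx8 code itself.

```c
#include <errno.h>

#define RMSG_OK_SKETCH 1	/* hypothetical value, for illustration only */

/* Old style: a boolean that says whether the request succeeded. */
static int sketch_is_ok(int status)
{
	return status == RMSG_OK_SKETCH;
}

/* New style: 0 on success, a negative errno on failure. */
static int sketch_error(int status)
{
	return status == RMSG_OK_SKETCH ? 0 : -EIO;
}

/* General mapping from an old 'uptodate' value to a new 'error' value. */
static int sketch_uptodate_to_error(int uptodate)
{
	if (uptodate > 0)
		return 0;			/* success */
	return uptodate ? uptodate : -EIO;	/* keep errno, 0 becomes -EIO */
}
```

Because the boolean is computed in one place and only passed along, changing
that one expression is enough; the callees merely rename the parameter.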

Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:42 +01:00
Kiyoshi Ueda 5047c3c64e blk_end_request: changing sunvdc (take 4)
This patch converts sunvdc to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

As a result, the interface of internal function, vdc_end_request(),
is changed.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:40 +01:00
Kiyoshi Ueda f01ab252cb blk_end_request: changing ps3disk (take 4)
This patch converts ps3disk to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

Cc: Geoff Levand <geoffrey.levand@am.sony.com>
Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:38 +01:00
Kiyoshi Ueda 097c94a4e8 blk_end_request: changing nbd (take 4)
This patch converts nbd to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

Cc: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:37 +01:00
Kiyoshi Ueda 1c5093ba03 blk_end_request: changing floppy (take 4)
This patch converts floppy to use blk_end_request interfaces.
Related 'uptodate' arguments are converted to 'error'.

As a result, the interface of internal function, floppy_end_request(),
is changed.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:21 +01:00
Kiyoshi Ueda 0156c2547e blk_end_request: changing DAC960 (take 4)
This patch converts DAC960 to use blk_end_request interfaces.
Related 'UpToDate' arguments are converted to 'Error'.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:36:04 +01:00
Adrian McMenamin 74ee1a7590 cdrom: Add support for Sega Dreamcast GD-ROM.
This patch adds support for the GD-Rom drive, SEGA's proprietary
implementation of an IDE CD Rom for the SEGA Dreamcast. This driver
implements Sega's Packet Interface (SPI) - at least partially. It will
also read disks in SEGA's proprietary GD format.

Unlike previous drivers (which were never in mainline) this uses DMA and
not PIO to read disks. It is a new driver, not a refactoring of old
drivers.

Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-01-28 13:19:04 +09:00
Greg Kroah-Hartman c10997f657 Kobject: convert drivers/* from kobject_unregister() to kobject_put()
There is no need for kobject_unregister() anymore, thanks to Kay's
kobject cleanup changes, so replace all instances of it with
kobject_put().


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:40 -08:00
Kay Sievers edfaa7c365 Driver core: convert block from raw kobjects to core devices
This moves the block devices to /sys/class/block. It will create a
flat list of all block devices, with the disks and partitions in one
directory. For compatibility /sys/block is created and contains symlinks
to the disks.

  /sys/class/block
  |-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
  |-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1
  |-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10
  |-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5
  |-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6
  |-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7
  |-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8
  |-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9
  `-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0

  /sys/block/
  |-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
  `-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:36 -08:00
Greg Kroah-Hartman 89c4260664 Kobject: change drivers/block/pktcdvd.c to use kobject_init_and_add
Stop using kobject_register, as this way we can control the sending of
the uevent properly, after everything is properly initialized.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:29 -08:00
Tony Jones 6013c12be8 pktcdvd: Convert from class_device to device for block/pktcdvd
struct class_device is going away, this converts the code to use struct
device instead.

Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:06 -08:00
Tony Jones aa27582614 paride: Convert from class_device to device for block/paride
struct class_device is going away, this converts the code to use struct
device instead.

Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Tim Waugh <tim@cyberelk.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:06 -08:00
Tony Jones 7ea7ed01ff aoechr: Convert from class_device to device
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Sam Hopkins <sah@coraid.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:05 -08:00
Randy Dunlap 7d1fd970e4 cciss: section mismatch
Mark cciss_pci_init() as __devinit, to fix section mismatch warning.

WARNING: vmlinux.o(.text+0x601fc9): Section mismatch: reference to .init.text: (between 'cciss_pci_init' and 'cciss_getgeometry')

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <mike.miller@hp.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-14 08:52:22 -08:00
Jens Axboe a24eab1ed5 loop: fix bad bio_alloc() nr_iovec request
Don't allocate room for an iovec when it is not needed.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-11 10:14:40 +01:00
Randy Dunlap 458cf5e9b6 Cleanup umem driver: fix most checkpatch warnings, conform to kernel
coding style.

  linux-2.6.24-rc5-git3> checkpatch.pl-next  patches/block-umem-ckpatch.patch
  total: 0 errors, 5 warnings, 530 lines checked

All of these are line-length warnings.

Only change in generated object file is due to not initializing a
static global variable to 0.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-12-18 08:29:28 +01:00
Dave Young d17a18dd92 pktcdvd: add kobject_put when kobject register fails
In kobject_register, the kobject reference is taken in kobject_init, and then
kobject_add is called.  If kobject_add fails, it only cleans up the reference
it took itself.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Greg KH <greg@kroah.com>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-17 19:28:16 -08:00
Neil Brown 794e64d5e9 Fix NULL dereference in umem.c
Signed-off-by: Neil Brown <neilb@suse.de>
Tested-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-10 19:43:55 -08:00
Andrew Morton 43cbe2cbdd aoe: properly initialise the request_queue's backing_dev_info
AOE forgot to initialise its queue's backing_dev_info, so kernels crash.
(http://bugzilla.kernel.org/show_bug.cgi?id=9482)

Fix that and consolidate aoeblk_gdalloc()'s error handling.

Thanks be to Jon for reporting and testing.

Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: <stable@kernel.org>
Cc: "Jon Nelson" <jnelson@jamponi.net>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-10 19:43:54 -08:00
Rusty Russell 74b2553f1d virtio: fix module/device unloading
The virtio code never hooked through the ->remove callback.  Although
no one supports device removal at the moment, this code is already
needed for module unloading.

This of course also revealed bugs in virtio_blk, virtio_net and lguest
unloading paths.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-11-19 11:20:42 +11:00
Christian Borntraeger 5d0360ee96 rd: fix data corruption on memory pressure
We have seen ramdisk based install systems, where some pages of mapped
libraries and programs were suddenly zeroed under memory pressure.  This
should not happen, as the ramdisk avoids freeing its pages by keeping them
dirty all the time.

It turns out that there is a case, where the VM makes a ramdisk page clean,
without telling the ramdisk driver.  On memory pressure shrink_zone runs
and it starts to run shrink_active_list.  There is a check for
buffer_heads_over_limit, and if true, pagevec_strip is called.
pagevec_strip calls try_to_release_page.  If the mapping has no releasepage
callback, try_to_free_buffers is called.  try_to_free_buffers now has
special logic for some file systems to make a dirty page clean, if all
buffers are clean.  That's what happened in our test case.

The simplest solution is to provide a noop-releasepage callback for the
ramdisk driver.  This avoids try_to_free_buffers for ramdisk pages.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:42 -08:00
Ondrej Zary e62aa046e1 paride: pf driver fixes
The pf driver for parallel port floppy drives seems to be broken.  At least
with Imation SuperDisk with EPAT chip, the driver calls pi_connect() and
pi_disconnect() after each transferred sector.  At least with EPAT, this
operation is very expensive - causes drive recalibration.  Thus, transferring
even a single byte (dd if=/dev/pf0 of=/dev/null bs=1 count=1) takes 20
seconds, making the driver useless.

The pf_next_buf() function seems to be broken as it returns 1 always (except
when pf_run is non-zero), causing the loop in do_pf_read_drq (and
do_pf_write_drq) to be executed only once.

The following patch fixes this problem.  It also fixes swapped descriptions in
pf_lock() function and removes DBMSG macro, which seems useless.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14 18:45:39 -08:00
Trond Myklebust 91cf45f02a [NET]: Add the helper kernel_sock_shutdown()
...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers.

Looking at the sock->op->shutdown() handlers, it looks as if all of them
take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the
RCV_SHUTDOWN/SEND_SHUTDOWN arguments.
Add a helper, and then define the SHUT_* enum to ensure that kernel users
of shutdown() don't get confused.
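
To illustrate the confusion, a small sketch. The K_ prefixed names are
hypothetical copies of the constants (prefixed to avoid clashing with
<sys/socket.h>), and the "how + 1" mapping is an assumption about how a
handler turns the 'how' argument into shutdown mask bits.

```c
/* Bits stored in the socket's shutdown mask. */
enum { K_RCV_SHUTDOWN = 1, K_SEND_SHUTDOWN = 2 };

/* Values a shutdown() handler expects as its 'how' argument. */
enum { K_SHUT_RD = 0, K_SHUT_WR = 1, K_SHUT_RDWR = 2 };

/* Assumed mapping from 'how' to mask bits; returns -1 if out of range. */
static int sketch_how_to_mask(int how)
{
	if (how < K_SHUT_RD || how > K_SHUT_RDWR)
		return -1;
	return how + 1;
}
```

Under these assumptions a caller that passes K_SEND_SHUTDOWN where a 'how'
value is expected actually asks for K_SHUT_RDWR, so both directions are shut
down instead of only the sending side.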

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-12 18:10:39 -08:00
Tejun Heo fffe487d59 pktcdvd: fix BUG caused by sysfs module reference semantics change
pkt_setup_dev() expects module reference to be held on invocation.
This used to be true for sysfs callbacks but not anymore.  Test and
grab module reference around pkt_setup_dev() in
class_pktcdvd_store_add().

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-08 08:00:24 +01:00
Roel Kluin b07989f51e paride: fix 'and' typo in drivers/block/paride/pt.c
Fix 'and' typo (PT_WRITE_OK is defined as 2)
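
A small sketch of why this kind of typo matters, assuming the mistake was a
logical '&&' where a bitwise '&' was meant. The helper names are
hypothetical; only the value of PT_WRITE_OK comes from the message above.

```c
#define PT_WRITE_OK 2	/* value stated in the commit message */

/* Logical AND: true for any non-zero flags, because PT_WRITE_OK itself
 * is non-zero. The bit is never actually tested. */
static int sketch_write_ok_buggy(int flags)
{
	return flags && PT_WRITE_OK;
}

/* Bitwise AND: true only when the PT_WRITE_OK bit is set. */
static int sketch_write_ok_fixed(int flags)
{
	return (flags & PT_WRITE_OK) != 0;
}
```

With flags == 1 the logical form reports "write ok" although bit 1 is clear,
while the bitwise form correctly reports that it is not.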

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-05 15:12:32 -08:00
Mike Miller bd4f36d6da cciss: update copyright notices
This patch updates the copyright information for the cciss driver. It
includes extending the year to 2007 (how timely) and some minor corrections
deemed necessary by HP legal and the Open Source Review Board. Please
consider this patch for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-29 11:33:05 +01:00
FUJITA Tomonori 4f33a9d9a4 ub: add sg_init_table for sense and read capacity commands
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-25 09:17:03 +02:00
Jens Axboe 3d1266c704 SG: audit of drivers that use blk_rq_map_sg()
They need to properly init the sg table, or blk_rq_map_sg() will
complain if CONFIG_DEBUG_SG is set.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-24 13:21:21 +02:00
Jens Axboe 642f149031 SG: Change sg_set_page() to take length and offset argument
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.

Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.
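
The folding can be sketched as follows. The sketch_ types and functions are
hypothetical stand-ins rather than the kernel's scatterlist definitions;
only the idea (one call sets page, length and offset, with a page-only
variant for the remaining callers) is taken from the message above.

```c
#include <stddef.h>

struct sketch_page { int id; };		/* hypothetical page placeholder */

struct sketch_sg {			/* hypothetical scatterlist entry */
	struct sketch_page *page;
	unsigned int offset;
	unsigned int length;
};

/* Page-only variant, for callers that set offset/length elsewhere. */
static void sketch_sg_assign_page(struct sketch_sg *sg,
				  struct sketch_page *page)
{
	sg->page = page;
}

/* Combined variant: replaces three separate assignments in the caller. */
static void sketch_sg_set_page(struct sketch_sg *sg,
			       struct sketch_page *page,
			       unsigned int len, unsigned int offset)
{
	sketch_sg_assign_page(sg, page);
	sg->offset = offset;
	sg->length = len;
}
```

Note the argument order assumed here: length first, then offset, matching
the order named in the subject line.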

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-24 11:20:47 +02:00
David Miller d91c5e8839 More SG build fixes
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2007-10-24 08:46:01 +02:00
Ralf Baechle 117636092a [PATCH] Fix breakage after SG cleanups
Commits

  58b053e4ce ("Update arch/ to use sg helpers")
  45711f1af6 ("[SG] Update drivers to use sg helpers")
  fa05f1286b ("Update net/ to use sg helpers")

converted many files to use the scatter gather helpers without ensuring
that the necessary header file <linux/scatterlist.h> is included.  This
happened to work for ia64, powerpc, sparc64 and x86 because they
happened to drag in that file via their <asm/dma-mapping.h>.

On most of the others this probably broke.

Instead of increasing the header file spider web I chose to include
<linux/scatterlist.h> directly in the affected files.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-23 12:02:39 -07:00
Rusty Russell 0ca49ca946 Remove old lguest bus and drivers.
This gets rid of the lguest bus, drivers and DMA mechanism, to make
way for a generic virtio mechanism.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23 15:49:55 +10:00
Rusty Russell e467cde238 Block driver using virtio.
The block driver uses scatter-gather lists with sg[0] being the
request information (struct virtio_blk_outhdr) with the type, sector
and inbuf id.  The next N sg entries are the bio itself, then the last
sg is the status byte.  Whether the N entries are in or out depends on
whether it's a read or a write.
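
The layout can be sketched like this. All types and names below are
hypothetical (this is not the real struct virtio_blk_outhdr or the virtqueue
API); the sketch assumes the header is always readable by the other side and
the status byte is always written by it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_dir { SEG_OUT, SEG_IN };	/* OUT: read by host, IN: written by host */

struct sketch_seg {
	void *addr;
	size_t len;
	enum sketch_dir dir;
};

struct sketch_outhdr {			/* hypothetical request header */
	uint32_t type;
	uint64_t sector;
	uint64_t id;
};

/* Fill segs[0..n+1]: header, n data buffers, status byte. Returns the
 * number of entries used. segs must have room for n + 2 entries. */
static size_t sketch_build_req(struct sketch_seg *segs,
			       struct sketch_outhdr *hdr,
			       void **bufs, const size_t *lens, size_t n,
			       int is_write, uint8_t *status)
{
	size_t i;

	assert(segs && hdr && status);
	segs[0] = (struct sketch_seg){ hdr, sizeof(*hdr), SEG_OUT };
	for (i = 0; i < n; i++)
		segs[1 + i] = (struct sketch_seg){
			bufs[i], lens[i], is_write ? SEG_OUT : SEG_IN };
	segs[n + 1] = (struct sketch_seg){ status, 1, SEG_IN };
	return n + 2;
}
```

For a two-buffer read this gives four entries: one outgoing header, two
incoming data buffers and one incoming status byte.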

We accept the normal (SCSI) ioctls: they get handed through to the other
side which can then handle it or reply that it's unsupported.  It's
not clear that this actually works in general, since I don't know
if blk_pc_request() requests have an accurate rq_data_dir().

Although we try to reply -ENOTTY on unsupported commands, ioctl(fd,
CDROMEJECT) returns success to userspace.  This needs a separate
patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <jens.axboe@oracle.com>
2007-10-23 15:49:54 +10:00
Jens Axboe 45711f1af6 [SG] Update drivers to use sg helpers
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-22 21:19:53 +02:00
Denis Cheng d489202ea2 remove unused return within void return function
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-20 02:18:21 +02:00
Adrian Bunk d96267ae46 remove duplicate MMAPPER Kconfig option
This option is already in arch/um/Kconfig.char

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Jeff Dike <jdike@addtoit.com>
2007-10-20 01:01:08 +02:00
Jan Engelhardt 96de0e252c Convert files to UTF-8 and some cleanups
* Convert files to UTF-8.

  * Also correct some people's names
    (one example is Eißfeldt, which was found in a source file.
    That the author used an ß at all in a source file
    indicates that the real name has in fact a 'ß' and not an 'ss',
    which is commonly used as a substitute for 'ß' when limited to
    7bit.)

  * Correct town names (Goettingen -> Göttingen)

  * Update Eberhard Mönkeberg's address (http://lkml.org/lkml/2007/1/8/313)

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-19 23:21:04 +02:00
Robert P. J. Day 3a4fa0a25d Fix misspellings of "system", "controller", "interrupt" and "necessary".
Fix the various misspellings of "system", "controller", "interrupt" and
"[un]necessary".

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-19 23:10:43 +02:00
Patrick Ringl 2e977c85d7 fix typos in drivers/block/Kconfig
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-19 23:05:02 +02:00
Pavel Emelyanov ba25f9dcc4 Use helpers to obtain task pid in printks
The task_struct->pid member is going to be deprecated, so start
using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
the kernel.

The first thing to start with is the pid, printed to dmesg - in
this case we may safely use task_pid_nr(). Besides, printks produce
more (much more) than half of all the explicit pid usage.

[akpm@linux-foundation.org: git-drm went and changed lots of stuff]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:43 -07:00
Joe Perches 898eb71cb1 Add missing newlines to some uses of dev_<level> messages
Found these while looking at printk uses.

Add missing newlines to dev_<level> uses
Add missing KERN_<level> prefixes to multiline dev_<level>s
Fixed a wierd->weird spelling typo
Added a newline to a printk

Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Smart <James.Smart@Emulex.Com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:28 -07:00
Linus Torvalds b6257a9036 Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  [SCSI] Remove full sg table memset()
  [SCSI] ide-scsi: remove usage of sg_last()
  Fix loop terminating conditions in fill_sg().
  [BLOCK] Clear sg entry before filling in blk_rq_map_sg()
  IA64: iommu uses sg_next with an invalid sg element
  cciss: disable DMA refetch on Smart Array P600
  swiotlb: fix map_sg failure handling
  SPARC64: fix iommu sg chaining
  [SCSI] ide-scsi: use scsi_sg_count() instead of ->use_sg
2007-10-17 09:08:13 -07:00
Jesper Juhl fdc1ca8aba floppy: remove register keyword use from floppy driver
The floppy drive is slow.  These days I see absolutely no good reason why the
floppy driver should try to gain a tiny bit of speed by telling gcc to
optimize access to some variables via the register keyword.  Better to just
leave gcc free to do whatever optimizations it deduces to be sane and not
hamper it by telling it that some variables in the floppy driver are special
and need to be fast (they don't).

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:03 -07:00
Jesper Juhl aee9041c5f floppy: remove dead/commented out code from floppy driver
A good initial step for a cleanup seems to me to be getting rid of old dead
code.  This stuff is either commented out or inside '#if 0' so it is not
currently in use at all, let's just get rid of it once and for all.  That's a
few lines less to deal with.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:03 -07:00
Jesper Juhl 06f748c475 floppy: do a very minimal style cleanup of the floppy driver
Yes, some of this will likely be replaced in later patches, but I do not see
anyone else coming out of the woodwork with any patches for this driver, so
I'll ignore comments about churn.  I want to get this driver cleaned up, and
if I'm going to do so I want to start with this basic style cleanup to reduce
the reading pain a bit.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:03 -07:00
Robert P. J. Day fac8b209b1 Remove final traces of long-deprecated "ramdisk" kernel parm
Since the "ramdisk" kernel parameter has been officially deprecated
since at least 2.6.18, might as well finally get rid of it.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:56 -07:00
Adrian Bunk 5a9df732b6 drivers/block/cciss.c: fix check-after-use
The Coverity checker spotted that we have already oops'ed if "disk"
was NULL.

Since "disk" being NULL seems impossible at this point this patch
removes the NULL check.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
Steve Cameron 1a614f5051 cciss: fix error reporting for SG_IO
This fixes a problem with the way cciss was filling out the "errors" field
of the request structure upon completion of requests.  Previously, it just
put a 1 or a 0 in there and used the negation of this as the uptodate
parameter to one of the functions in the block layer, being a block device.
 For the SG_IO ioctl, this was not sufficient, and we noticed that, for
example, sg_turs from sg3_utils did not correctly detect problems due to
cciss having set rq->errors incorrectly.

Signed-off-by: Stephen M. Cameron <steve.cameron@hp.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
Paul Clements 7fdfd4065c NBD: allow hung network I/O to be cancelled
Allow NBD I/O to be cancelled when a network outage occurs.  Previously, I/O
would just hang, and if enough I/O was hung in nbd, the system (at least
user-level) would completely hang until a TCP timeout (default, 15 minutes)
occurred.

The patch introduces a new ioctl NBD_SET_TIMEOUT that allows a transmit
timeout value (in seconds) to be specified.  Any network send that exceeds the
timeout will be cancelled and the nbd connection will be shut down.  I've
tested with various timeout values and 6 seconds seems to be a good choice for
the timeout.  If the NBD_SET_TIMEOUT ioctl is not called, you get the old (I/O
hang) behavior.

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
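The send-timeout behaviour described in the 7fdfd4065c entry above can be illustrated from user space. This is only a rough sketch, assuming an ordinary Python socket stands in for the nbd connection: it does not call the NBD_SET_TIMEOUT ioctl, and the helper name `send_or_shut_down`, the payload size, and the 0.2 second timeout are made up for the illustration.

```python
import socket


def send_or_shut_down(sock, payload, timeout_s):
    """Send payload; if the send stalls past timeout_s, close the connection."""
    # A timeout of None would block forever, which models the old (hang) behaviour.
    sock.settimeout(timeout_s)
    try:
        sock.sendall(payload)
        return "sent"
    except socket.timeout:
        # The send exceeded the timeout: cancel it and shut the connection down.
        sock.close()
        return "shut down"


# A peer that never reads stands in for a network outage.
a, b = socket.socketpair()
result = send_or_shut_down(a, b"x" * (64 * 1024 * 1024), timeout_s=0.2)
print(result)
b.close()
```

The payload is far larger than the socket buffer, so the send stalls once the buffer fills and the timeout path is taken instead of hanging indefinitely.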
Paul Clements 4b86a87256 NBD: set uninitialized devices to size 0
This fixes errors with utilities (such as LVM's vgscan) that try to scan all
devices.  Previously this would generate read errors when uninitialized nbd
devices were scanned:

# vgscan
   Reading all physical volumes.  This may take a while...
   /dev/nbd0: read failed after 0 of 1024 at 0: Input/output error
   /dev/nbd0: read failed after 0 of 1024 at 509804544: Input/output error
   /dev/nbd0: read failed after 0 of 2048 at 0: Input/output error
   /dev/nbd1: read failed after 0 of 1024 at 509804544: Input/output error
   /dev/nbd1: read failed after 0 of 2048 at 0: Input/output error

From now on, uninitialized nbd devices will have size zero, which
prevents these errors.

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
Jan Beulich 2e9c47cd4d floppy: tolerate DMA channel unavailability
The floppy driver is already written to be able to operate in virtual DMA
mode.  Thus it can easily be adjusted to tolerate failure from
fd_request_dma() as long as virtual DMA mode is not disallowed.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
Ed L. Cashin abdbf94d7c aoe: remove unnecessary wrapper function
We can just use skb_mac_header now, and we don't need a wrapper function to
perform the cast.  Instead of requiring the reader to check aoe.h to look
up what an aoe_hdr function does, I'd rather do without it.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:52 -07:00
Diego Woitasen 759d7c6c47 Remove unneeded lock_kernel() in driver/block/loop.c
Signed-off-by: Diego Woitasen <diego@woitasen.com.ar>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:48 -07:00
Denis Cheng 0cbc591bf8 nbd: change a parameter's type to remove a memcpy call
This memcpy looks strange: it is really just a pointer dereference.  Change
the parameter's type so the value is passed directly, which makes the memcpy
unnecessary.

nbd_find_request is called only once, from nbd_read_stat, so the argument
passed there is changed accordingly.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:47 -07:00
Denis Cheng d2c9740b49 nbd: use list_for_each_entry_safe to make it more consolidated and readable
The loop body may delete nodes while traversing the list, so use the safe version.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:47 -07:00
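Why the safe variant matters in the d2c9740b49 entry above can be shown with a toy list. This is a sketch in Python rather than kernel C, and it assumes a singly linked list for brevity (kernel lists are doubly linked and circular); `Node`, `build`, `remove_matching_safe` and `to_list` are hypothetical names. The point it illustrates is the same: the next pointer is saved before the current node may be unlinked.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def build(values):
    """Build a singly linked list and return its head."""
    head = None
    for v in reversed(values):
        node = Node(v)
        node.next = head
        head = node
    return head


def remove_matching_safe(head, should_remove):
    """Unlink matching nodes, saving 'next' before the current node is touched."""
    dummy = Node(None)
    dummy.next = head
    prev, cur = dummy, head
    while cur is not None:
        nxt = cur.next          # saved first, like the extra cursor in the _safe macro
        if should_remove(cur.value):
            prev.next = nxt
            cur.next = None     # simulate freeing: cur can no longer be followed
        else:
            prev = cur
        cur = nxt               # safe even though cur may have been unlinked
    return dummy.next


def to_list(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


remaining = remove_matching_safe(build([1, 2, 3, 4]), lambda v: v % 2 == 0)
print(to_list(remaining))  # [1, 3]
```

A loop that read `cur.next` after unlinking would stop early here, because the removed node no longer points anywhere.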
Peter Zijlstra e0bf68ddec mm: bdi init hooks
provide BDI constructor/destructor hooks

[akpm@linux-foundation.org: compile fix]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:45 -07:00
Mike Miller (OS Dev) 8bf50f71cb cciss: disable DMA refetch on Smart Array P600
This patch disables DMA refetch in the PCI bridge. We have disabled DMA
prefetch for quite some time. Testing with XEN revealed another ASIC bug. If
dom0 resides on a P600 the board can cause an MCA by accessing invalid memory
addresses. Apparently, we need to disable both prefetch and refetch.
My understanding is a refetch operation should not occur but it is a valid
thing to do if prefetched data is no longer available for whatever reason.
Please consider this patch for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-17 10:10:04 +02:00
Linus Torvalds 92d15c2ccb Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block: (63 commits)
  Fix memory leak in dm-crypt
  SPARC64: sg chaining support
  SPARC: sg chaining support
  PPC: sg chaining support
  PS3: sg chaining support
  IA64: sg chaining support
  x86-64: enable sg chaining
  x86-64: update pci-gart iommu to sg helpers
  x86-64: update nommu to sg helpers
  x86-64: update calgary iommu to sg helpers
  swiotlb: sg chaining support
  i386: enable sg chaining
  i386 dma_map_sg: convert to using sg helpers
  mmc: need to zero sglist on init
  Panic in blk_rq_map_sg() from CCISS driver
  remove sglist_len
  remove blk_queue_max_phys_segments in libata
  revert sg segment size ifdefs
  Fixup u14-34f ENABLE_SG_CHAINING
  qla1280: enable use_sg_chaining option
  ...
2007-10-16 10:09:16 -07:00
Dmitry Monakhov 8268f5a741 deny partial write for loop dev fd
Partial write can be easily supported by LO_CRYPT_NONE mode, but it is not
easy in LO_CRYPT_CRYPTOAPI case, because of its block nature.  I don't know
who still uses cryptoapi, but theoretically it is possible.  So let's leave
things as they are.  The loop device didn't support partial writes before
Nick's "write_begin/write_end" patch set, so let it behave the same way
after.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Nick Piggin afddba49d1 fs: introduce write_begin, write_end, and perform_write aops
These are intended to replace prepare_write and commit_write with more
flexible alternatives that are also able to avoid the buffered write
deadlock problems efficiently (which prepare_write is unable to do).

[mark.fasheh@oracle.com: API design contributions, code review and fixes]
[akpm@linux-foundation.org: various fixes]
[dmonakhov@sw.ru: new aop block_write_begin fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Jens Axboe 3eed13fd93 Merge branch 'sglist-arch' into for-linus 2007-10-16 12:29:34 +02:00
Jens Axboe a39d113936 Merge branch 'barrier' into for-linus 2007-10-16 12:29:29 +02:00
Lee Schermerhorn a683d652d3 Panic in blk_rq_map_sg() from CCISS driver
New scatter/gather list chaining [sg_next()] treats 'page' member of
struct scatterlist with low bit set [0x01] as a chain pointer to
another struct scatterlist [array].  The CCISS driver request function
passes an uninitialized, temporary, on-stack scatterlist array to
blk_rq_map_sg().  sg_next() interprets random data on the stack as a
chain pointer and eventually tries to de-reference an invalid pointer,
resulting in:

[<ffffffff8031dd70>] blk_rq_map_sg+0x70/0x170
PGD 6090c3067 PUD 0
Oops: 0000 [1] SMP
last sysfs file: /block/cciss!c0d0/cciss!c0d0p1/dev
CPU 6
Modules linked in: ehci_hcd ohci_hcd uhci_hcd
Pid: 1, comm: init Not tainted 2.6.23-rc6-mm1 #3
RIP: 0010:[<ffffffff8031dd70>] [<ffffffff8031dd70>] blk_rq_map_sg+0x70/0x170
RSP: 0018:ffff81060901f768 EFLAGS: 00010206
RAX: 000000040b161000 RBX: ffff81060901f7d8 RCX: 000000040b162c00
RDX: 0000000000000000 RSI: ffff81060b13a260 RDI: ffff81060b139600
RBP: 0000000000001400 R08: 00000000fffffffe R09: 0000000000000400
R10: 0000000000000000 R11: 000000040b163000 R12: ffff810102fe0000
R13: 0000000000000001 R14: 0000000000000001 R15: 00001e0000000000
FS: 00000000026108f0(0063) GS:ffff810409000b80(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000010000001e CR3: 00000006090c6000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process init (pid: 1, threadinfo ffff81060901e000, task ffff810409020800)
last branch before last exception/interrupt
from [<ffffffff8031de0a>] blk_rq_map_sg+0x10a/0x170
to [<ffffffff8031dd70>] blk_rq_map_sg+0x70/0x170
Stack: 000000018068ea00 ffff810102fe0000 0000000000000000 ffff810011400000
0000000000000002 0000000000000000 ffff81040b172000 ffffffff803acd3d
0000000000003ec1 ffff8106090d5000 ffff8106090d5000 ffff810102fe0000
Call Trace:
[<ffffffff803acd3d>] do_cciss_request+0x15d/0x4c0
[<ffffffff80298968>] new_slab+0x1c8/0x270
[<ffffffff80298ffd>] __slab_alloc+0x22d/0x470
[<ffffffff8027327b>] mempool_alloc+0x4b/0x130
[<ffffffff8032b21e>] cfq_set_request+0xee/0x380
[<ffffffff8027327b>] mempool_alloc+0x4b/0x130
[<ffffffff8031ff98>] get_request+0x168/0x360
[<ffffffff80331b0d>] rb_insert_color+0x8d/0x110
[<ffffffff8031cfd8>] elv_rb_add+0x58/0x60
[<ffffffff8032a329>] cfq_add_rq_rb+0x69/0xa0
[<ffffffff8031c1ab>] elv_merged_request+0x5b/0x60
[<ffffffff803224fd>] __make_request+0x23d/0x650
[<ffffffff80298ffd>] __slab_alloc+0x22d/0x470
[<ffffffff80270000>] generic_write_checks+0x140/0x190
[<ffffffff8031f012>] generic_make_request+0x1c2/0x3a0
<etc>
Kernel panic - not syncing: Attempted to kill init!

This patch initializes the tmp_sg array to zeroes.  Perhaps not the ultimate
fix, but an effective work-around.  I can now boot 23-rc6-mm1 on an HP
Proliant x86_64 with CCISS boot disk.

Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>

 drivers/block/cciss.c |    1 +
 1 file changed, 1 insertion(+)
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:24:44 +02:00
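The failure mode described in the a683d652d3 entry above can be illustrated with a small model. This is a Python sketch, not kernel code: the only fact taken from the commit message is that a 'page' word with its low bit [0x01] set is treated as a chain pointer. The helper names and the sample garbage values are assumptions chosen for the illustration.

```python
# Toy model of scatterlist chaining (illustrative only, not kernel code).
# Assumption: each entry is reduced to its 'page' word.

CHAIN_BIT = 0x01


def sg_is_chain(page_word):
    """Return True if the entry's 'page' word would be read as a chain pointer."""
    return bool(page_word & CHAIN_BIT)


def count_bogus_chains(sg_array):
    """Count entries that would be followed as chain pointers."""
    return sum(1 for word in sg_array if sg_is_chain(word))


# An uninitialized on-stack array may hold arbitrary leftovers (hypothetical values).
stack_garbage = [0xFFFF81060901F7D9, 0x0, 0x1E0000000001]
# The work-around in the commit: zero the temporary array before mapping into it.
zeroed = [0] * len(stack_garbage)

print(count_bogus_chains(stack_garbage))  # 2
print(count_bogus_chains(zeroed))         # 0
```

With the array zeroed, no entry can be mistaken for a chain pointer, which matches the one-line fix the diffstat in that entry reports.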
Laurent Riffard 7e3da6c4b9 pktcdvd: don't rely on bio_init() preserving bio->bi_destructor
Signed-off-by: Laurent Riffard <laurent.riffard@free.fr>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:05:09 +02:00
Jens Axboe 761a15e7ac pktcdvd: don't rely on bio_init() preserving bio->bi_io_vec
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:05:08 +02:00
Jens Axboe fd5d806266 block: convert blkdev_issue_flush() to use empty barriers
Then we can get rid of ->issue_flush_fn() and all the driver private
implementations of that.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:05:02 +02:00
Jeff Garzik 87ad900164 drivers/block/cpqarray,cciss: kill unused var
The recent bio work and subsequent fixups created unused variables.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 09:59:55 +02:00
Al Viro b4482a4b2e more trivial signedness fixes in drivers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-14 12:41:52 -07:00
Robert P. J. Day cebe0fe70f [S390] Remove obsolete recommendation for 8M ramdisk size.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-10-12 16:13:09 +02:00
Linus Torvalds e86908614f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (408 commits)
  [POWERPC] Add memchr() to the bootwrapper
  [POWERPC] Implement logging of unhandled signals
  [POWERPC] Add legacy serial support for OPB with flattened device tree
  [POWERPC] Use 1TB segments
  [POWERPC] XilinxFB: Allow fixed framebuffer base address
  [POWERPC] XilinxFB: Add support for custom screen resolution
  [POWERPC] XilinxFB: Use pdata to pass around framebuffer parameters
  [POWERPC] PCI: Add 64-bit physical address support to setup_indirect_pci
  [POWERPC] 4xx: Kilauea defconfig file
  [POWERPC] 4xx: Kilauea DTS
  [POWERPC] 4xx: Add AMCC Kilauea eval board support to platforms/40x
  [POWERPC] 4xx: Add AMCC 405EX support to cputable.c
  [POWERPC] Adjust TASK_SIZE on ppc32 systems to 3GB that are capable
  [POWERPC] Use PAGE_OFFSET to tell if an address is user/kernel in SW TLB handlers
  [POWERPC] 85xx: Enable FP emulation in MPC8560 ADS defconfig
  [POWERPC] 85xx: Killed <asm/mpc85xx.h>
  [POWERPC] 85xx: Add cpm nodes for 8541/8555 CDS
  [POWERPC] 85xx: Convert mpc8560ads to the new CPM binding.
  [POWERPC] mpc8272ads: Remove muram from the CPM reg property.
  [POWERPC] Make clockevents work on PPC601 processors
  ...

Fixed up conflict in Documentation/powerpc/booting-without-of.txt manually.
2007-10-11 21:55:47 -07:00
Linus Torvalds 038a5008b2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (867 commits)
  [SKY2]: status polling loop (post merge)
  [NET]: Fix NAPI completion handling in some drivers.
  [TCP]: Limit processing lost_retrans loop to work-to-do cases
  [TCP]: Fix lost_retrans loop vs fastpath problems
  [TCP]: No need to re-count fackets_out/sacked_out at RTO
  [TCP]: Extract tcp_match_queue_to_sack from sacktag code
  [TCP]: Kill almost unused variable pcount from sacktag
  [TCP]: Fix mark_head_lost to ignore R-bit when trying to mark L
  [TCP]: Add bytes_acked (ABC) clearing to FRTO too
  [IPv6]: Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2
  [NETFILTER]: x_tables: add missing ip6t_modulename aliases
  [NETFILTER]: nf_conntrack_tcp: fix connection reopening
  [QETH]: fix qeth_main.c
  [NETLINK]: fib_frontend build fixes
  [IPv6]: Export userland ND options through netlink (RDNSS support)
  [9P]: build fix with !CONFIG_SYSCTL
  [NET]: Fix dev_put() and dev_hold() comments
  [NET]: make netlink user -> kernel interface synchronious
  [NET]: unify netlink kernel socket recognition
  [NET]: cleanup 3rd argument in netlink_sendskb
  ...

Fix up conflicts manually in Documentation/feature-removal-schedule.txt
and my new least favourite crap, the "mod_devicetable" support in the
files include/linux/mod_devicetable.h and scripts/mod/file2alias.c.

(The latter files seem to be explicitly _designed_ to get conflicts when
different subsystems work with them - they have an absolutely horrid
lack of subsystem separation!)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-11 19:40:14 -07:00
Stephen Rothwell 8251b4c481 [POWERPC] iSeries: Move viodasd probing
This way we only have entries in the device tree for disks that actually
exist.  A slight complication is that disks may be attached to LPARs
at runtime.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-11 20:40:48 +10:00
Eric W. Biederman 881d966b48 [NET]: Make the device list and device lookups per namespace.
This patch makes most of the generic device layer network
namespace safe.  This patch makes dev_base_head a
network namespace variable, and then it picks up
a few associated variables.  The functions:
dev_getbyhwaddr
dev_getfirsthwbytype
dev_get_by_flags
dev_get_by_name
__dev_get_by_name
dev_get_by_index
__dev_get_by_index
dev_ioctl
dev_ethtool
dev_load
wireless_process_ioctl

were modified to take a network namespace argument, and
deal with it.

vlan_ioctl_set and brioctl_set were modified so their
hooks will receive a network namespace argument.

So basically anything in the core of the network stack that was
affected by the change of dev_base was modified to handle
multiple network namespaces.  The rest of the network stack was
simply modified to explicitly use &init_net the initial network
namespace.  This can be fixed when those components of the network
stack are modified to handle multiple network namespaces.

For now the ifindex generator is left global.

Fundamentally ifindex numbers are per namespace, or else
we will have corner case problems with migration when
we get that far.

At the same time there are assumptions in the network stack
that the ifindex of a network device won't change.  Making
the ifindex number global seems a good compromise until
the network stack can cope with ifindex changes when
you change namespaces, and the like.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:10 -07:00
Eric W. Biederman e730c15519 [NET]: Make packet reception network namespace safe
This patch modifies every packet receive function
registered with dev_add_pack() to drop packets if they
are not from the initial network namespace.

This should ensure that the various network stacks do
not receive packets in anything but the initial network
namespace until the code has been converted and is ready
for them.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:08 -07:00
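The guard described in the e730c15519 entry above amounts to an early return in each receive function. A minimal Python sketch of that idea follows; the string `INIT_NET`, the `receive` helper and the namespace labels are hypothetical stand-ins, not the kernel's types or functions.

```python
INIT_NET = "init_net"  # stand-in for the initial network namespace


def receive(packet_namespace, handler):
    """Drop packets from any namespace other than the initial one."""
    if packet_namespace != INIT_NET:
        return "dropped"      # not yet converted: refuse to process the packet
    return handler()


print(receive("init_net", lambda: "handled"))      # handled
print(receive("container-ns", lambda: "handled"))  # dropped
```

Because the check sits at the top of the receive path, unconverted protocol code never sees a packet from another namespace.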
Grant Likely d2bbf3da37 Sysace: Don't enable IRQ until after interrupt handler is registered
The previous patch to move the interrupt handler registration moved it
below enabling interrupts which could be a problem if the device is on
a shared interrupt line.  This patch fixes the order.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:26:00 +02:00
Grant Likely b5515d86f2 Sysace: sparse fixes
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:59 +02:00
Grant Likely 34e1b83413 Sysace: Minor coding convention fixup
Put function call and return code test on separate lines.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:59 +02:00
Jeff Garzik cb3503ca54 drivers/block/umem: use DRIVER_NAME where appropriate
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 09:25:59 +02:00
Jeff Garzik 4e953a2162 drivers/block/umem: trim trailing whitespace
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 09:25:59 +02:00
Jeff Garzik ee4a7b6874 drivers/block/umem: minor cleanups
* tab-align DRIVER_*, pci_driver entries

* reduced wasted memory by killing unused struct cardinfo members

* move free_irq() call above resource unmap, to fix tiny window where
  irq handler may access recently-unmapped memory

* propagate pci_enable_device() return value

* use pci_request_regions, pci_release_regions() for resource reservation

* call pci_disable_device() in pci_driver::remove()

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 09:25:59 +02:00
Jeff Garzik 4e0af881af drivers/block/umem: use dev_printk()
dev_printk() gives us a consistent prefix (driver name + PCI bus id),
which allows us to eliminate the hand-rolled one.

Also allows us to eliminate card->card_number, which was used solely in
printk() calls.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 09:25:59 +02:00
Jeff Garzik 3084f0c610 drivers/block/umem: move private include away from include/linux
Move include/linux/umem.h to drivers/block, as umem.c is the only user,
and it's not an exported header.

Move the PCI_{VENDOR,DEVICE}_ID_* constants to include/linux/pci_ids.h.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 09:25:59 +02:00
Grant Likely ed155a95a4 Sysace: Labels in C code should not be indented.
Remove the indentation on labels

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:59 +02:00
Grant Likely 95e896c35f Sysace: Add of_platform_bus binding
The of_platform bus binding is needed to make the device driver usable
under arch/powerpc.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:59 +02:00
Grant Likely 32f6fff47d Sysace: Move IRQ handler registration to occur after FSM is initialized
The FSM needs to be initialized before it is safe to call the ISR

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:58 +02:00
Grant Likely 4a24d8610d Sysace: minor rework and cleanup changes
Miscellaneous rework to the sysace driver; not critical, but makes the
subsequent addition of the of_platform bus binding a wee bit cleaner

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:58 +02:00
Grant Likely 1b45546654 Sysace: Move structure allocation from bus binding into common code
Split the determination of device registers/irqs/etc from the actual
allocation and initialization of the device structure.  This cleans
up the code a bit in preparation to add an of_platform bus binding

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:58 +02:00
Grant Likely edec49616c Sysace: Use the established platform bus api
SystemACE uses the platform bus binding, but it doesn't use the
platform bus API.  Move to using the correct API for consistency's
sake and future proofing against platform bus changes.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:57 +02:00
NeilBrown 6712ecf8f6 Drop 'size' argument from bio_endio and bi_end_io
As bi_end_io is only called once when the request is complete,
the 'size' argument is now redundant.  Remove it.

Now there is no need for bio_endio to subtract the size completed
from bi_size.  So don't do that either.

While we are at it, change bi_end_io to return void.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:57 +02:00
Jens Axboe 6c92e699b5 Fixup rq_for_each_segment() indentation
Remove one level of nesting where appropriate.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:56 +02:00
NeilBrown eea9befacc Fix various abuse of bio fields in umem.c
umem.c:
* advances bi_idx and bi_sector to track where it is up to.
  But it is only ever doing this on one bio, so the updated
  fields can easily be kept elsewhere (current_*).

* updates bi_size, but never uses the updated values, so
  this isn't needed.

* reuses bi_phys_segments to count how many iovecs have been
  completed.  As the completion happens sequentially, we
  can store this information outside the bio too.

Signed-off-by: Neil Brown <neilb@suse.de>

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:56 +02:00
NeilBrown 5705f70217 Introduce rq_for_each_segment replacing rq_for_each_bio
Every usage of rq_for_each_bio wraps a usage of
bio_for_each_segment, so these can be combined into
rq_for_each_segment.

We define "struct req_iterator" to hold the 'bio' and 'index' that
are needed for the double iteration.

Signed-off-by: Neil Brown <neilb@suse.de>

Various compile fixes by me...

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:56 +02:00
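The combined iteration introduced in the 5705f70217 entry above can be pictured as flattening a nested loop. The sketch below is Python, not the kernel macro: a request is modelled as a list of bios, each bio as a list of segments, and the yielded `(bio_idx, seg_idx)` pair plays the role that "struct req_iterator" plays in the commit message. The sample data is hypothetical.

```python
def rq_for_each_segment(request):
    """Yield (bio_index, segment_index, segment) across every bio in a request."""
    for bio_idx, bio in enumerate(request):          # outer loop: was rq_for_each_bio
        for seg_idx, segment in enumerate(bio):      # inner loop: was bio_for_each_segment
            yield bio_idx, seg_idx, segment


# Hypothetical request: two bios holding three segments in total.
request = [["seg-a", "seg-b"], ["seg-c"]]
segments = [seg for _, _, seg in rq_for_each_segment(request)]
print(segments)  # ['seg-a', 'seg-b', 'seg-c']
```

Callers get one flat loop and one level less nesting, which is what the follow-up indentation fix in 6c92e699b5 takes advantage of.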
Andrew Morton 3558c9b323 Fix "Fix DAC960 driver on machines which don't support 64-bit DMA"
sparc32:

drivers/block/DAC960.c: In function 'DAC960_V1_EnableMemoryMailboxInterface':
drivers/block/DAC960.c:1168: error: 'DMA_32BIT_MASK' undeclared (first use in this function)
drivers/block/DAC960.c:1168: error: (Each undeclared identifier is reported only

Cc: <dac@conglom-o.org>
Cc: <stable@kernel.org>
Cc: Alessandro Polverini <alex@nibbles.it>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19 11:24:16 -07:00
Matthew Wilcox 868047fcbb Fix DAC960 driver on machines which don't support 64-bit DMA
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=8942

Use PCI_DMA_* constants instead of the driver's own private definitions.
Fall back to a 32-bit DMA mask if a 64-bit one fails.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Acked-by: Jeff Garzik <jeff@garzik.org>
Tested-by: Lars <polynomial-c@gmx.de>
Cc: Alessandro Polverini <alex@nibbles.it>
Cc: <dac@conglom-o.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-11 17:21:19 -07:00
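The fallback described in the 868047fcbb entry above follows a simple try-in-order pattern. The following Python sketch models only that control flow: `choose_dma_mask` and `platform_32bit_only` are hypothetical names, and the `set_mask` callback is an assumed stand-in for whatever call the driver uses to ask the platform to accept a mask.

```python
DMA_64BIT_MASK = (1 << 64) - 1
DMA_32BIT_MASK = (1 << 32) - 1


def choose_dma_mask(set_mask):
    """Try a 64-bit mask first, then fall back to 32-bit; None if both fail."""
    for mask in (DMA_64BIT_MASK, DMA_32BIT_MASK):
        if set_mask(mask):
            return mask
    return None


def platform_32bit_only(mask):
    # Hypothetical platform hook: accepts only masks that fit in 32 bits.
    return mask <= DMA_32BIT_MASK


print(hex(choose_dma_mask(platform_32bit_only)))  # 0xffffffff
```

On a machine that rejects the 64-bit mask the driver still ends up with a usable 32-bit one instead of failing initialization outright.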
David S. Miller 1bd4b28039 [SUNVDC]: Use slice 0xff on VD_DISK_TYPE_DISK.
While debugging issues with the VDS server I made the
driver use partition 2 to get at the whole disk since
this is the "whole disk" partition in the Sun disk
label.

We really should use slice 0xff which really means
the whole physical disk in the VIO disk protocol.
Otherwise things won't work well on a disk image
that doesn't have a proper disk label on it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-08-26 18:49:07 -07:00
Geert Uytterhoeven 928923c76b Introduce CONFIG_CHECK_SIGNATURE
Introduce CONFIG_CHECK_SIGNATURE to control inclusion of check_signature()
and avoid problems on platforms that don't have readb().

Let the few legacy (ISA || PCI || X86) drivers that need check_signature()
select CONFIG_CHECK_SIGNATURE.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-22 19:52:45 -07:00
Jan Engelhardt 06bfb7eb15 Add some help texts to recently-introduced kconfig items
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (edited MACINTOSH_DRIVERS per Geert Uytterhoeven's remark)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-18 09:52:50 -07:00
Rusty Russell 9ef7ad2296 Enable partitions for lguest block device
The lguest block device only requests one minor, which means
partitions don't work (eg "root=/dev/lgba1").

Let's follow the crowd and ask for 16.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-18 09:45:51 -07:00
Christoph Hellwig a6b3a93e15 sysace: HDIO_GETGEO has had its own method for ages
The way this driver tries to implement HDIO_GETGEO it'll never be called.
Then again on ppc it probably will never be called anyway because it's
utterly pointless.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-08-11 22:34:48 +02:00
Mariusz Kozlowski 2e4934aa45 drivers/block/cpqarray.c: better error handling and kmalloc + memset conversion to k[cz]alloc
This patch removes some redundant casts, does the kmalloc + memset to
k[cz]alloc conversion and it changes the error path to use goto (to avoid code
duplication).

 drivers/block/cpqarray.c | 49567 -> 48623 (-944 bytes)
 drivers/block/cpqarray.o | 178820 -> 178288 (-532 bytes)

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-08-11 22:34:48 +02:00
Mariusz Kozlowski 1aebe18787 drivers/block/cciss.c: kmalloc + memset conversion to kzalloc
 drivers/block/cciss.c | 104285 -> 104168 (-117 bytes)
 drivers/block/cciss.o | 277400 -> 277124 (-276 bytes)

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-08-11 22:34:48 +02:00
Jesper Juhl 9b99628f8e Clean up duplicate includes in drivers/block/
This patch cleans up duplicate includes in
	drivers/block/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Satyam Sharma <satyam.sharma@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-08-11 22:34:48 +02:00
Jesper Juhl f2912a1223 cciss: fix memory leak
There's a memory leak in the cciss driver.

in alloc_cciss_hba() we may leak sizeof(ctlr_info_t) bytes if a
call to alloc_disk(1 << NWD_SHIFT) fails.
This patch should fix the issue.

Spotted by the Coverity checker.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:43 -07:00
Rusty Russell 05ff09706b Make lguest compile with CONFIG_BLOCK=n and CONFIG_NET=n
Gabriel C reports lguest doesn't compile with CONFIG_BLOCK=n.  Fix this
by introducing a config var for the block device, which depends on
LGUEST && BLOCK.  Do the same for the net driver, rather than depending
gratuitously on CONFIG_NET.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Gabriel C <nix.or.die@googlemail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-29 17:37:45 -07:00
Rusty Russell e2c9784325 lguest: documentation III: Drivers
Documentation: The Drivers

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:35:17 -07:00
Jens Axboe 165125e1e4 [BLOCK] Get rid of request_queue_t typedef
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweep of
the kernel and get rid of this typedef and replace its uses with
the proper type.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-24 09:28:11 +02:00
Geert Uytterhoeven c6131fa528 ps3: Disk Storage Driver
Add a Disk Storage Driver for the PS3:
  - Implemented as a block device driver with a dynamic major
  - Disk names (and partitions) are of the format ps3d%c(%u)
  - Uses software scatter-gather with a 64 KiB bounce buffer as the hypervisor
    doesn't support scatter-gather

Cc: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:16 -07:00
David S. Miller 91ba3c2128 [SPARC64]: Fix handling of multiple vdc-port nodes.
The "id" properties in vdc-port nodes are not unique; they
are all zero.  Therefore assign IDs using the parent's
"cfg-handle" property, which will be unique.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:27:18 -07:00
Fabio Massimo Di Nitto da68e0814a [SPARC64]: Fix MODULE_DEVICE_TABLE() specification in VDC and VNET.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:26:57 -07:00
Paul Mundt 20c2df83d2 mm: Remove slab destructors from kmem_cache_create().
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-20 10:11:58 +09:00
Rusty Russell b754416bfe lguest: the block driver
Lguest block driver

A simple block driver for lguest.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:53 -07:00
Yoann Padioleau dd00cc486a some kmalloc/memset ->kzalloc (tree wide)
Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).

Here is a short excerpt of the semantic patch performing
this transformation:

@@
type T2;
expression x;
identifier f,fld;
expression E;
expression E1,E2;
expression e1,e2,e3,y;
statement S;
@@

 x =
- kmalloc
+ kzalloc
  (E1,E2)
  ...  when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
- memset((T2)x,0,E1);

@@
expression E1,E2,E3;
@@

- kzalloc(E1 * E2,E3)
+ kcalloc(E1,E2,E3)

[akpm@linux-foundation.org: get kcalloc args the right way around]
Signed-off-by: Yoann Padioleau <padator@wanadoo.fr>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <bryan.wu@analog.com>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Acked-by: Roland Dreier <rolandd@cisco.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Acked-by: Pierre Ossman <drzeus-list@drzeus.cx>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Greg KH <greg@kroah.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:50 -07:00
Linus Torvalds 31bdc5dc76 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Set vio->desc_buf to NULL after freeing.
  [SPARC]: Mark sparc and sparc64 as not having virt_to_bus
  [SPARC64]: Fix reset handling in VNET driver.
  [SPARC64]: Handle reset events in vio_link_state_change().
  [SPARC64]: Handle LDC resets properly in domain-services driver.
  [SPARC64]: Massively simplify VIO device layer and support hot add/remove.
  [SPARC64]: Simplify VNET probing.
  [SPARC64]: Simplify VDC device probing.
  [SPARC64]: Add basic infrastructure for MD add/remove notification.
2007-07-18 10:23:37 -07:00
Jeremy Fitzhardinge 9f27ee5950 xen: add virtual block device driver.
The block device frontend driver allows the kernel to access block
devices exported by a virtual machine containing a physical
block device driver.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <axboe@kernel.dk>
2007-07-18 08:47:45 -07:00
David S. Miller 80dc35dfb9 [SPARC64]: Simplify VDC device probing.
We just need to match on the vdc-port nodes, the parent
is really not interesting at all.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:19:55 -07:00
Akinobu Mita c6d4d63489 unregister_blkdev(): delete redundant message
No need for the caller to warn about unregister_blkdev() failure.  (The
previous patch makes unregister_blkdev() print an error message on error.)

Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:03 -07:00
Akinobu Mita 00d59405cf unregister_blkdev() delete redundant messages in callers
No need for the callers to warn about unregister_blkdev() failure.  (The
previous patch makes unregister_blkdev() print an error message on error.)

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:03 -07:00
Grant Likely 74489a91dd Add support for Xilinx SystemACE CompactFlash interface
Tested on Xilinx Virtex ppc405, Katmai 440SPe, and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Stefan Roese <sr@denx.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John William <jwilliams@itee.uq.edu.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
Rafael J. Wysocki 8314418629 Freezer: make kernel threads nonfreezable by default
Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves.  This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (i.e. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear.  Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
Linus Torvalds 489de30259 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (209 commits)
  [POWERPC] Create add_rtc() function to enable the RTC CMOS driver
  [POWERPC] Add H_ILLAN_ATTRIBUTES hcall number
  [POWERPC] xilinxfb: Parameterize xilinxfb platform device registration
  [POWERPC] Oprofile support for Power 5++
  [POWERPC] Enable arbitary speed tty ioctls and split input/output speed
  [POWERPC] Make drivers/char/hvc_console.c:khvcd() static
  [POWERPC] Remove dead code for preventing pread() and pwrite() calls
  [POWERPC] Remove unnecessary #undef printk from prom.c
  [POWERPC] Fix typo in Ebony default DTS
  [POWERPC] Check for NULL ppc_md.init_IRQ() before calling
  [POWERPC] Remove extra return statement
  [POWERPC] pasemi: Don't auto-select CONFIG_EMBEDDED
  [POWERPC] pasemi: Rename platform
  [POWERPC] arch/powerpc/kernel/sysfs.c: Move NUMA exports
  [POWERPC] Add __read_mostly support for powerpc
  [POWERPC] Modify sched_clock() to make CONFIG_PRINTK_TIME more sane
  [POWERPC] Create a dummy zImage if no valid platform has been selected
  [POWERPC] PS3: Bootwrapper support.
  [POWERPC] powermac i2c: Use mutex
  [POWERPC] Schedule removal of arch/ppc
  ...

Fixed up conflicts manually in:

	Documentation/feature-removal-schedule.txt
	arch/powerpc/kernel/pci_32.c
	arch/powerpc/kernel/pci_64.c
	include/asm-powerpc/pci.h

and asked the powerpc people to double-check the result.
2007-07-16 17:58:08 -07:00
S.Çağlar Onur 9793c32667 Fix too few arguments to function `scsi_cmd_ioctl'
This corrects the following compile error introduced by the merge of the
new bsg layer in commit e245befce7af0a1e1347079ed62695b059594bd4:

  caglar@zangetsu linux-2.6 $ make
    CHK     include/linux/version.h
    CHK     include/linux/utsrelease.h
    CALL    scripts/checksyscalls.sh
    CHK     include/linux/compile.h
    LD      drivers/block/built-in.o
    CC [M]  drivers/block/cciss.o
  drivers/block/cciss.c: In function `cciss_ioctl':
  drivers/block/cciss.c:1173: warning: passing arg 2 of `scsi_cmd_ioctl' from incompatible pointer type
  drivers/block/cciss.c:1173: warning: passing arg 3 of `scsi_cmd_ioctl' makes pointer from integer without a cast
  drivers/block/cciss.c:1173: warning: passing arg 4 of `scsi_cmd_ioctl' makes integer from pointer without a cast
  drivers/block/cciss.c:1173: error: too few arguments to function `scsi_cmd_ioctl'
  ...
  make[2]: *** [drivers/block/cciss.o] Hata 1
  make[1]: *** [drivers/block] Hata 2
  make: *** [drivers] Hata 2

Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 12:11:27 -07:00
Linus Torvalds e245befce7 Merge branch 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block
* 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block: (25 commits)
  bsg: Kconfig updates
  bsg: add SCSI transport-level request support
  bsg: add bidi support
  add a struct request pointer to the request structure
  bsg: fix the deadlock on discarding done commands
  bsg: fix a blocking read bug
  bsg: minor bug fixes
  improve bsg device allocation
  bind bsg to all SCSI devices
  bsg: bind bsg to request_queue instead of gendisk
  bsg: add a request_queue argument to scsi_cmd_ioctl()
  bsg: simplify __bsg_alloc_command failpath
  bsg: add cheasy error checks for sysfs stuff
  Add queue resizing support
  Replace s32, u32 and u64 with __s32, __u32 and __u64 in bsg.h for userspace
  bsg: silence a bogus gcc warning
  bsg: style cleanup
  bsg: use u32 etc instead of uint32_t
  bsg: add SG_IO to SG v4
  bsg: replace SG v3 with SG v4
  ...
2007-07-16 10:50:19 -07:00
Linus Torvalds 14dc524972 Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  splice: direct splicing updates ppos twice
  more ACSI removal
  umem: Fix match of pci_ids in umem driver
  umem: Remove references to dead CONFIG_MM_MAP_MEMORY variable
  remove the documentation for the legacy CDROM drivers
2007-07-16 10:48:20 -07:00
Linus Torvalds 02b2318e07 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: (26 commits)
  [SPARC64]: Fix UP build.
  [SPARC64]: dr-cpu unconfigure support.
  [SERIAL]: Fix console write locking in sparc drivers.
  [SPARC64]: Give more accurate errors in dr_cpu_configure().
  [SPARC64]: Clear cpu_{core,sibling}_map[] in smp_fill_in_sib_core_maps()
  [SPARC64]: Fix leak when DR added cpu does not bootup.
  [SPARC64]: Add ->set_affinity IRQ handlers.
  [SPARC64]: Process dr-cpu events in a kthread instead of workqueue.
  [SPARC64]: More sensible udelay implementation.
  [SPARC64]: SMP build fixes.
  [SPARC64]: mdesc.c needs linux/mm.h
  [SPARC64]: Fix build regressions added by dr-cpu changes.
  [SPARC64]: Unconditionally register vio_bus_type.
  [SPARC64]: Initial LDOM cpu hotplug support.
  [SPARC64]: Fix setting of variables in LDOM guest.
  [SPARC64]: Fix MD property lifetime bugs.
  [SPARC64]: Abstract out mdesc accesses for better MD update handling.
  [SPARC64]: Use more mearningful names for IRQ registry.
  [SPARC64]: Initial domain-services driver.
  [SPARC64]: Export powerd facilities for external entities.
  ...
2007-07-16 10:45:23 -07:00
Oleg Nesterov be0ef957c9 nbd.c: sock_xmit: cleanup signal related code
sock_xmit() re-implements sigprocmask() and dequeue_signal_lock().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Oleg Nesterov 3e1ac130d0 kcdrwd: remove unneeded flush_signals() call
kcdrwd() is a kernel thread, all signals are ignored.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:50 -07:00
Stephen Rothwell f057eac0d7 Introduce CONFIG_VIRT_TO_BUS
Make some offending drivers depend on it and set CONFIG_ARCH_NO_VIRT_TO_BUS
for ppc64 so that we don't build those drivers.

This gets PowerPC allmodconfig and allyesconfig much closer to building.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Richard Knutsson 21eb92025e drivers/block/z2ram: Remove TRUE/FALSE defines
Remove defines of TRUE and FALSE
  * not used in the file
  * the file is not included somewhere else

Signed-off-by: Richard Knutsson <ricknu-0@student.ltu.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:42 -07:00
Adrian Bunk 56a68a500f more ACSI removal
This patch removes some code that became dead code after the ATARI_ACSI
removal.

It also indirectly fixes the following bug introduced by
commit c2bcf3b8978c291e1b7f6499475c8403a259d4d6:

 config ATARI_SLM
        tristate "Atari SLM laser printer support"
-       depends on ATARI && ATARI_ACSI!=n
+       depends on ATARI

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 15:02:47 +02:00
Neil Brown 5874c18b10 umem: Fix match of pci_ids in umem driver
The PCI device list for umem was not using PCI_DEVICE, so the
subvendor/subdevice fields were not set to ANY, so matching
didn't work properly.

Change to use PCI_DEVICE.
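
The effect can be modelled in user-space C. This is a sketch, not the umem
or PCI core code: struct dev_id, DEV_ID(), ANY_ID and id_matches() are
hypothetical stand-ins modelled on struct pci_device_id, PCI_DEVICE() and
PCI_ANY_ID.

```c
#include <stdbool.h>
#include <stdint.h>

#define ANY_ID 0xffffffffu	/* stand-in for the kernel's PCI_ANY_ID */

struct dev_id {
	uint32_t vendor, device;
	uint32_t subvendor, subdevice;
};

/* Modelled on PCI_DEVICE(): wildcard the subsystem fields. */
#define DEV_ID(vend, dev) \
	{ .vendor = (vend), .device = (dev), \
	  .subvendor = ANY_ID, .subdevice = ANY_ID }

/* A table field matches if it is the wildcard or equals the hardware value. */
static bool field_matches(uint32_t want, uint32_t have)
{
	return want == ANY_ID || want == have;
}

/* All four fields of the table entry must match the hardware IDs. */
bool id_matches(const struct dev_id *id, const struct dev_id *hw)
{
	return field_matches(id->vendor, hw->vendor) &&
	       field_matches(id->device, hw->device) &&
	       field_matches(id->subvendor, hw->subvendor) &&
	       field_matches(id->subdevice, hw->subdevice);
}
```

A table entry that only initialises vendor and device leaves the subsystem
fields at zero, so in this model it matches only hardware whose subsystem
IDs are also zero; the DEV_ID() form matches any subsystem IDs.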

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 14:39:07 +02:00
Robert P. J. Day 51ea208c37 umem: Remove references to dead CONFIG_MM_MAP_MEMORY variable
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 14:39:06 +02:00
David S. Miller 43fdf27470 [SPARC64]: Abstract out mdesc accesses for better MD update handling.
Since we have to be able to handle MD updates, having an in-tree
set of data structures representing the MD objects actually makes
things more painful.

The MD itself is easy to parse, and we can implement the existing
interfaces using direct parsing of the MD binary image.

The MD is now reference counted, so accesses have to now take the
form:

	handle = mdesc_grab();

	... operations on MD ...

	mdesc_release(handle);

The only remaining issues are cases where code holds on to references
to MD property values.  mdesc_get_property() returns a direct pointer
to the property value.  Most cases just pull in the information they
need and discard the pointer, but there are a few that use the pointer
directly over a long lifetime.  Those will be fixed up in a subsequent
changeset.

A preliminary handler for MD update events from domain services is
there; it is rudimentary, but it works and handles all of the reference
counting.  It does not check the generation number of the MDs,
and it does not generate an "add/delete" list for notification to
interested parties about MD changes, but that will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:28 -07:00
David S. Miller 667ef3c396 [SPARC64]: Add Sun LDOM virtual disk driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:03:56 -07:00
FUJITA Tomonori 45e79a3acd bsg: add a request_queue argument to scsi_cmd_ioctl()
bsg uses scsi_cmd_ioctl() for some SCSI/sg ioctl
commands. scsi_cmd_ioctl() gets a request queue from a gendisk
argument. This prevents bsg being bound to SCSI devices that don't
have a gendisk (like OSD). This adds a request_queue argument to
scsi_cmd_ioctl(). The SCSI/sg ioctl commands don't use a gendisk so
it's safe for any SCSI devices to use scsi_cmd_ioctl().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-16 08:52:45 +02:00
Linus Torvalds bc06cffdec Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (166 commits)
  [SCSI] ibmvscsi: convert to use the data buffer accessors
  [SCSI] dc395x: convert to use the data buffer accessors
  [SCSI] ncr53c8xx: convert to use the data buffer accessors
  [SCSI] sym53c8xx: convert to use the data buffer accessors
  [SCSI] ppa: coding police and printk levels
  [SCSI] aic7xxx_old: remove redundant GFP_ATOMIC from kmalloc
  [SCSI] i2o: remove redundant GFP_ATOMIC from kmalloc from device.c
  [SCSI] remove the dead CYBERSTORMIII_SCSI option
  [SCSI] don't build scsi_dma_{map,unmap} for !HAS_DMA
  [SCSI] Clean up scsi_add_lun a bit
  [SCSI] 53c700: Remove printk, which triggers because of low scsi clock on SNI RMs
  [SCSI] sni_53c710: Cleanup
  [SCSI] qla4xxx: Fix underrun/overrun conditions
  [SCSI] megaraid_mbox: use mutex instead of semaphore
  [SCSI] aacraid: add 51245, 51645 and 52245 adapters to documentation.
  [SCSI] qla2xxx: update version to 8.02.00-k1.
  [SCSI] qla2xxx: add support for NPIV
  [SCSI] stex: use resid for xfer len information
  [SCSI] Add Brownie 1200U3P to blacklist
  [SCSI] scsi.c: convert to use the data buffer accessors
  ...
2007-07-15 16:51:54 -07:00
Matthias Kaehlcke a69228deef USB: drivers/block/ub.c: use list_for_each_entry()
Low performance USB storage driver: Use list_for_each_entry() instead
of list_for_each()

Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-12 16:34:40 -07:00
Tejun Heo 7b595756ec sysfs: kill unnecessary attribute->owner
sysfs is now completely out of driver/module lifetime game.  After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners.  Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner.  Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

  http://article.gmane.org/gmane.linux.kernel/510293

(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Paul Mackerras bf22f6fe2d Merge branch 'for-2.6.23' into merge
2007-07-11 13:28:26 +10:00
Linus Torvalds 01370f0603 Merge branch 'splice-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block
* 'splice-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block:
  pipe: add documentation and comments
  pipe: change the ->pin() operation to ->confirm()
  Remove remnants of sendfile()
  xip sendfile removal
  splice: completely document external interface with kerneldoc
  sendfile: remove bad_sendfile() from bad_file_ops
  shmem: convert to using splice instead of sendfile()
  relay: use splice_to_pipe() instead of open-coding the pipe loop
  pipe: allow passing around of ops private pointer
  splice: divorce the splice structure/function definitions from the pipe header
  splice: relay support
  sendfile: convert nfsd to splice_direct_to_actor()
  sendfile: convert nfs to using splice_read()
  loop: convert to using splice_direct_to_actor() instead of sendfile()
  splice: add void cookie to the actor data
  sendfile: kill generic_file_sendfile()
  sendfile: remove .sendfile from filesystems that use generic_file_sendfile()
  sys_sendfile: switch to using ->splice_read, if available
  vmsplice: add vmsplice-to-user support
  splice: abstract out actor data
2007-07-10 13:51:06 -07:00
Jan Engelhardt fd11d171e5 Make a "menuconfig" out of the Kconfig objects "menu, ..., endmenu",
so that the user can disable all the options in that menu at once
instead of having to disable each option separately.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 13:43:30 +02:00
Stephen Rothwell 36a700307e [POWERPC] Fix viodasd geometry calculations
Commit a885c8c431 that introduced the
getgeo block device method changed the fallback number of sectors and
introduced a bug into the fallback cylinder number calculation.

Thanks to Rusty Russell for noticing this.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-07-10 21:40:28 +10:00
Jens Axboe cac36bb06e pipe: change the ->pin() operation to ->confirm()
The name 'pin' was badly chosen; it doesn't pin a pipe buffer
in the most commonly used sense in the kernel. So change the
name to 'confirm', after debating this issue with Hugh
Dickins a bit.

A good return from ->confirm() means that the buffer is really
there, and that the contents are good.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:04:15 +02:00
Jens Axboe d6b29d7cee splice: divorce the splice structure/function definitions from the pipe header
We need to move even more stuff into the header so that folks can use
the splice_to_pipe() implementation instead of open-coding a lot of
pipe knowledge (see relay implementation), so move to our own header
file finally.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:04:14 +02:00
Jens Axboe fd5821404e loop: convert to using splice_direct_to_actor() instead of sendfile()
This gets rid of the dependency on ->sendfile() for receiving data
and converts loop to ->splice_read() instead.

Also includes an IV offset fix from Hugh Dickins.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:04:14 +02:00
Boaz Harrosh e654bc4393 [PATCH] fix request->cmd == INT cases
- I have unearthed very old bugs in stale drivers that still
   used request->cmd as a READ|WRITE int
 - This patch is maybe a proof that these drivers have not been
   used for a long time. Should they be removed completely?

Drivers that currently do not work for sure:
 drivers/acorn/block/fd1772.c |    2 +-
 drivers/acorn/block/mfmhd.c  |    8 ++++----
 drivers/cdrom/aztcd.c        |    2 +-
 drivers/cdrom/cm206.c        |    2 +-
 drivers/cdrom/gscd.c         |    2 +-
 drivers/cdrom/mcdx.c         |    2 +-
 drivers/cdrom/optcd.c        |    2 +-
 drivers/cdrom/sjcd.c         |    2 +-

Drivers with cosmetic fixes only:
  b/drivers/block/amiflop.c
  b/drivers/block/nbd.c
  b/drivers/ide/legacy/hd.c

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:03:34 +02:00
Mike Miller (OS Dev 9cff3b383d cciss: add new controller support for P700m
This patch adds support for the Smart Array P700m SAS controller. This new
controller will ship Fall 2007.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:03:33 +02:00
Jens Axboe c2bcf3b897 [PATCH] Remove acsi.c
Originally from Boaz Harrosh <bharrosh@panasas.com>

It hasn't been working in 2.5 or 2.6 ever, since it's still buffer_head
based.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-10 08:03:33 +02:00
Ken Chen a47653fc26 loop: preallocate eight loop devices
The kernel's on-demand loop device instantiation breaks several user space
tools, as the tools are not ready to cope with the "on-demand feature".  Fix
it by instantiating 8 loop devices by default and reinstating the max_loop
module parameter.

Signed-off-by: Ken Chen <kenchen@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-08 17:23:32 -07:00
James Bottomley 5bc65793cb [SCSI] Merge up to linux-2.6 head
Conflicts:

	drivers/scsi/jazz_esp.c

Same changes made by both SCSI and SPARC trees: problem with UTF-8
conversion in the copyright.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-05-30 23:57:05 -05:00
FUJITA Tomonori 41ce639a1c [SCSI] cciss: convert to use the data buffer accessors
- remove the unnecessary map_single path.

- convert to use the new accessors for the sg lists and the
parameters.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Mike Miller <Mike.Miller@hp.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-05-29 11:22:13 -05:00
Eric Sesterhenn / Snakebyte 4acb3e2f97 Off by one in floppy.c
Another Coverity patch I forgot to resend; the original thread is here:
http://marc.info/?l=linux-kernel&m=115144559823592&w=2

In case drive == N_DRIVE, we get one past the drive_params array.
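
The class of bug can be shown with a small user-space sketch. The exact
check in floppy.c is not reproduced here; struct params, get_params() and
the value of N_DRIVE below are assumptions made for illustration.

```c
#include <stddef.h>

#define N_DRIVE 8	/* array size; the value here is illustrative */

struct params {
	int spinup;
};

static struct params drive_params[N_DRIVE];

/*
 * Valid indices are 0 .. N_DRIVE - 1.  A check written as
 * "drive > N_DRIVE" would let drive == N_DRIVE through and index one
 * element past the end of the array, so the comparison must be ">=".
 */
struct params *get_params(int drive)
{
	if (drive < 0 || drive >= N_DRIVE)
		return NULL;
	return &drive_params[drive];
}
```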

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23 20:14:15 -07:00
Gerald Britton e9ca75b535 cciss: Fix pci_driver.shutdown while device is still active
Fix an Oops in the cciss driver caused by system shutdown while a filesystem
on a cciss device is still active.  The cciss_remove_one function only
properly removes the device if the device has been cleanly released by its
users, which is not the case when the pci_driver.shutdown method is called.

This patch adds a new cciss_shutdown function to better match the pattern
used by various SCSI drivers: deactivate device interrupts and flush caches.
It also alters the cciss_remove_one function to match and re-adds the
__devexit annotation that was removed when cciss_remove_one was serving as
the pci_driver.shutdown method.

Signed-off-by: Gerald Britton <gbritton@alum.mit.edu>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-18 21:03:49 -07:00
Al Viro 705962ccc9 fix deadlock in loop.c
... doh

Jeremy Fitzhardinge noted that the recent loop.c cleanups worked, but
cause lockdep to complain.

Ouch.  OK, the deadlock is real and yes, I'm an idiot.  Speaking of which,
we probably want to s/lock/pin/ in drivers/base/map.c to avoid such
brainos again.  And yes, this stuff needs clear documentation.  Will try
to put one together once I get some sleep...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-13 09:44:05 -07:00
Al Viro 07002e9956 fix the dynamic allocation and probe in loop.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Ken Chen <kenchen@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-12 16:53:02 -07:00
Martin Schwidefsky 61d48c2c31 [S390] Kconfig: use common Kconfig files for s390.
Disband drivers/s390/Kconfig, use the common Kconfig files. The s390
specific config options from drivers/s390/Kconfig are moved to the
respective common Kconfig files.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-05-10 15:46:08 +02:00
Linus Torvalds 9a9136e270 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
  sound: convert "sound" subdirectory to UTF-8
  MAINTAINERS: Add cxacru website/mailing list
  include files: convert "include" subdirectory to UTF-8
  general: convert "kernel" subdirectory to UTF-8
  documentation: convert the Documentation directory to UTF-8
  Convert the toplevel files CREDITS and MAINTAINERS to UTF-8.
  remove broken URLs from net drivers' output
  Magic number prefix consistency change to Documentation/magic-number.txt
  trivial: s/i_sem /i_mutex/
  fix file specification in comments
  drivers/base/platform.c: fix small typo in doc
  misc doc and kconfig typos
  Remove obsolete fat_cvf help text
  Fix occurrences of "the the "
  Fix minor typoes in kernel/module.c
  Kconfig: Remove reference to external mqueue library
  Kconfig: A couple of grammatical fixes in arch/i386/Kconfig
  Correct comments in genrtc.c to refer to correct /proc file.
  Fix more "deprecated" spellos.
  Fix "deprecated" typoes.
  ...

Fix trivial comment conflict in kernel/relay.c.
2007-05-09 12:54:17 -07:00
Nate Diller 01f2705daf fs: convert core functions to zero_user_page
It's very common for file systems to need to zero part or all of a page;
the simplest way is just to use kmap_atomic() and memset().  There's
actually a library function in include/linux/highmem.h that does exactly
that, but it's confusingly named memclear_highpage_flush(), which is
descriptive of *how* it does the work rather than what the *purpose* is.
So this patchset renames the function to zero_user_page(), and calls it
from the various places that currently open code it.
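
As a rough illustration of the operation described above, here is a userspace
sketch in which a plain buffer stands in for a mapped page.  The names
zero_page_range_sketch and SKETCH_PAGE_SIZE are hypothetical; the kernel
helper also has to map the page and flush the data cache, which is not
modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/*
 * Hypothetical userspace stand-in: zero `size` bytes starting at `offset`
 * inside a page-sized buffer.  Mapping the page (kmap_atomic) and cache
 * flushing, which the kernel helper needs, are deliberately left out.
 */
static void zero_page_range_sketch(unsigned char *page, size_t offset,
				   size_t size)
{
	/* The range must lie entirely inside the page. */
	assert(offset <= SKETCH_PAGE_SIZE && size <= SKETCH_PAGE_SIZE - offset);
	memset(page + offset, 0, size);
}
```

Naming the helper after its purpose (zeroing) rather than its mechanism is the
point of the rename; callers no longer open-code the memset().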

This first patch introduces the new function call, and converts all the
core kernel callsites, both the open-coded ones and the old
memclear_highpage_flush() ones.  Following this patch is a series of
conversions for each file system individually, per AKPM, and finally a
patch deprecating the old call.  The diffstat below shows the entire
patchset.

[akpm@linux-foundation.org: fix a few things]
Signed-off-by: Nate Diller <nate.diller@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:55 -07:00
WANG Cong 84963048ca nbd: check the return value of sysfs_create_file
[akpm@linux-foundation.org: fix it]
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:49 -07:00
Michael Opdenacker 59c51591a0 Fix occurrences of "the the "
Signed-off-by: Michael Opdenacker <michael@free-electrons.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-05-09 08:57:56 +02:00
Stephen Cameron d5d3b736e3 cciss: include scsi/scsi.h unconditionally
Make cciss unconditionally include scsi/scsi.h, because of the use of
SCSI_IOCTL_GET_IDLUN and SCSI_IOCTL_GET_BUS_NUMBER.

Signed-off-by: Stephen M. Cameron <steve.cameron@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:10 -07:00
Mike Miller (OS Dev) 198b766013 cciss: set rq->errors more correctly in driver
Set rq->errors more correctly in cciss driver.  Previously we had set it
synonymously with the meaning of the last parameter of end_that_last_request
and complete_buffers (the "uptodate" parameter) and had gotten away with it
for all this time because nobody ever looked at rq->errors.
SCSI_IOCTL_SEND_COMMAND looks at rq->errors, so now it matters that it be
right.

Signed-off-by: Stephen M. Cameron <steve.cameron@hp.com>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:09 -07:00
Mike Miller (OS Dev) 03bbfee58d cciss: add SG_IO ioctl to cciss
For all of you who think cciss should be a scsi driver, here is the patch that
you have been waiting for all these years. This patch actually adds the SG_IO
ioctl to cciss. The primary purpose is for clustering and high-availability.
But now anyone can exploit this ioctl in any manner they wish.

Note, SCSI_IOCTL_SEND_COMMAND doesn't work with this patch due to rq->errors
being set incorrectly.  Subsequent patch fixes that.

Signed-off-by: Stephen M. Cameron <steve.cameron@hp.com>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:09 -07:00
Mike Miller (OS Dev) d38ae168bf cciss: reformat error handling
Reformat some error handling code to reduce line lengths a bit.

Signed-off-by: Stephen M. Cameron <steve.cameron@hp.com>
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:09 -07:00
Ken Chen 7328508274 remove artificial software max_loop limit
Remove the artificial maximum of 256 loop devices that can be created, which
is due to a legacy device number limit.  Searching through the lkml archive,
there are several instances where users complained about the artificial limit
that the loop driver imposes.  There is no reason to have such a limit.

This patch gets rid of the limit entirely and makes loop device and associated
block queue instantiation happen on demand.  On-demand instantiation also has
the benefit of not wasting memory if these devices are not in use (compared
to the current implementation, which always creates 8 loop devices), a net
improvement in both areas.  This version has been tested with the creation of
a large number of loop devices and is compatible with the existing
losetup/mount userland tools.
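
The on-demand idea can be sketched in userspace roughly as follows.  This is
not the driver's code: struct loop_dev_sketch, loop_get_sketch() and the
simple linked list are hypothetical stand-ins, and locking and block queue
setup are omitted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a loop device; not the driver's real structure. */
struct loop_dev_sketch {
	int number;
	struct loop_dev_sketch *next;
};

static struct loop_dev_sketch *loop_devices_sketch;	/* empty at start */
static int loop_devices_created_sketch;

/*
 * Return device `number`, allocating it the first time it is asked for.
 * Returns NULL if the allocation fails.
 */
static struct loop_dev_sketch *loop_get_sketch(int number)
{
	struct loop_dev_sketch *lo;

	assert(number >= 0);

	/* Reuse the device if it was already instantiated. */
	for (lo = loop_devices_sketch; lo; lo = lo->next)
		if (lo->number == number)
			return lo;

	/* Otherwise create it now; nothing is preallocated. */
	lo = malloc(sizeof(*lo));
	if (!lo)
		return NULL;
	lo->number = number;
	lo->next = loop_devices_sketch;
	loop_devices_sketch = lo;
	loop_devices_created_sketch++;
	return lo;
}
```

With this shape no memory is spent until a device number is first requested,
and the number itself is not bounded by a fixed-size table.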

A number of people worked on this and provided valuable suggestions; in no
particular order:

Jens Axboe
Jan Engelhardt
Christoph Hellwig
Thomas M

Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Randy Dunlap e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Dmitriy Monakhov 4ea1b0f4c4 floppy: handle device_create_file() failure while init
This patch kills the "ignoring return value of 'device_create_file'"
warning message.

Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:02 -07:00
Peter Zijlstra f98393a64c mm: remove destroy_dirty_buffers from invalidate_bdev()
Remove the destroy_dirty_buffers argument from invalidate_bdev(); it hasn't
been used in 6 years (so akpm says).

find * -name \*.[ch] | xargs grep -l invalidate_bdev |
while read file; do
	quilt add $file;
	sed -ie 's/invalidate_bdev(\([^,]*\),[^)]*)/invalidate_bdev(\1)/g' $file;
done

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07 12:12:55 -07:00
Pavel Emelianov 7562f876cd [NET]: Rework dev_base via list_head (v3)
Clean up the use of the dev_base list, with the aim of simplifying making the
device list per-namespace. In almost every case, use of the dev_base variable
and the dev->next pointer could easily be replaced by a for_each_netdev
loop. The few most complicated places were converted to use
first_netdev()/next_netdev().

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 15:13:45 -07:00
Pete Zaitcev 643616e678 ub: Bind to first endpoint, not to last
usb-storage recently switched to binding to the first endpoint. Apparently,
there are devices out there with extra endpoints. It is perfectly legal.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-04-27 13:28:34 -07:00
Arnaldo Carvalho de Melo c1d2bbe1cd [SK_BUFF]: Introduce skb_reset_network_header(skb)
For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
later turn skb->nh.raw into an offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
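
A minimal sketch of what such an accessor looks like, using a cut-down
stand-in for struct sk_buff (the _sketch names are hypothetical and only the
two fields mentioned above are modeled):

```c
#include <assert.h>

/* Simplified stand-in for struct sk_buff: only the fields named above. */
struct sk_buff_sketch {
	unsigned char *data;
	struct {
		unsigned char *raw;
	} nh;
};

/*
 * Wraps the open-coded 'skb->nh.raw = skb->data'.  Once every caller goes
 * through a helper like this, the representation of nh.raw (pointer or
 * offset) can change in one place.
 */
static inline void skb_reset_network_header_sketch(struct sk_buff_sketch *skb)
{
	skb->nh.raw = skb->data;
}
```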

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:46 -07:00
Arnaldo Carvalho de Melo 98e399f82a [SK_BUFF]: Introduce skb_mac_header()
For the places where we need a pointer to the mac header, it is still legal to
touch skb->mac.raw directly if just adding to, subtracting from or setting it
to another layer header.

This one also converts some more cases to skb_reset_mac_header() that my
regex missed as it had no spaces before nor after '=', ugh.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:41 -07:00
Arnaldo Carvalho de Melo 459a98ed88 [SK_BUFF]: Introduce skb_reset_mac_header(skb)
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into an offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:32 -07:00
Arnaldo Carvalho de Melo 029720f15d [AOE]: Introduce aoe_hdr()
For consistency with other skb->mac.raw users.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:24:28 -07:00
Andrew Morton cbc31a475a packet: fix error handling
The packet driver is assuming (reasonably) that the (undocumented)
request.errors is an errno.  But it is in fact some mysterious bitfield.  When
things go wrong we return weird positive numbers to the VFS as pointers and it
goes oops.
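
To see why a positive number is dangerous here, consider a userspace sketch
modeled on the kernel's error-pointer convention, where only values in the
top 4095 addresses count as errors.  The _sketch names are hypothetical, and
treating this convention as the mechanism involved is an assumption based on
the "as pointers" wording above.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode an error code in a pointer value, as callers expect for -errno. */
static inline void *err_ptr_sketch(long error)
{
	return (void *)(intptr_t)error;
}

/* Only values in the last MAX_ERRNO_SKETCH addresses are seen as errors. */
static inline int is_err_sketch(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}
```

A negative errno such as -5 is recognised by is_err_sketch(), but a positive
value such as 5 is not, so the caller would go on to treat it as a valid
pointer and dereference it.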

Thanks to William Heimbigner for reporting and diagnosis.

(It doesn't oops, but this driver still doesn't work for William)

Cc: William Heimbigner <icxcnika@mar.tar.cc>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-25 13:50:55 -07:00
Alexey Dobriyan 671d40f4aa paride drivers: initialize spinlocks
pcd_lock and pf_spin_lock are passed to blk_init_queue() which, seeing them
as valid lock pointers, sets them as ->queue_lock.

The problem is that pcd_lock and pf_spin_lock aren't initialized anywhere.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-24 08:23:08 -07:00
Bjorn Helgaas b6550777a3 [PATCH] cciss: unregister from SCSI before tearing down device resources
We must unregister from SCSI before we unmap device resources and unhook
the IRQ handler.  Otherwise, SCSI may send us more requests, and we won't
be able to handle them.

I see the following oops during every reboot of my HP DL360:

    ...
    Unmounting local filesystems...done.
    Rebooting... Completed flushing cache on controller 0
    BUG: unable to handle kernel paging request at virtual address f8808040
     printing eip:
    c02dc72b
    *pde = 02120067
    *pte = 00000000
    Oops: 0002 [#1]
    SMP
    Modules linked in:
    CPU:    1
    EIP:    0060:[<c02dc72b>]    Not tainted VLI
    EFLAGS: 00010046   (2.6.21-rc6 #1)
    EIP is at SA5_submit_command+0xb/0x20
    eax: f8808000   ebx: f7a00000   ecx: f79f0000   edx: 37a00000
    esi: f79f0000   edi: 00000000   ebp: 00000000   esp: dd717a44
    ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
    Process khelper (pid: 1427, ti=dd716000 task=c2260a70 task.ti=dd716000)
    Stack: c02df2c0 f7a00000 f7a00000 00d41008 c02df691 00000000 00000010 00000002
	   00000001 f79f0000 f7fff844 c1398420 00000000 00000000 00001000 230a3020
	   69666564 5420656e 50434f49 465f544b 4853554c 44414552 0a312009 66656423
    Call Trace:
     [<c02df2c0>] start_io+0x80/0x120
     [<c02df691>] do_cciss_request+0x331/0x350
     [<c014242a>] mempool_alloc+0x2a/0xe0
     [<c020ad71>] blk_alloc_request+0x61/0x80
     [<c020b02e>] get_request+0x15e/0x1e0
     [<c01595e0>] cache_alloc_refill+0xb0/0x1e0
     [<c021049d>] as_update_rq+0x2d/0x80
     [<c0210d28>] as_add_request+0x68/0x90
     [<c0207f99>] elv_insert+0x119/0x160
     [<c020bd0b>] __make_request+0xcb/0x320
     [<c0122ee0>] lock_timer_base+0x20/0x50
     [<c0123096>] del_timer+0x56/0x60
     [<c020a7b8>] blk_remove_plug+0x38/0x70
     [<c020a815>] __generic_unplug_device+0x25/0x30
     [<c020a835>] generic_unplug_device+0x15/0x30
    ...

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-12 15:31:42 -07:00
Mike Miller (OS Dev) 7f42d3b8a7 [PATCH] cciss: add init of drv->cylinders back to cciss_geometry_inquiry
This patch adds initialization of drv->cylinders back into the failing case in
cciss_geometry_inquiry. I inadvertently removed it in one of my 2TB updates.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-04 21:12:47 -07:00
Al Viro 27d871833e [PATCH] paride endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 15:27:50 -07:00
Al Viro 4c1f2b3168 [PATCH] cciss endian annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 15:27:50 -07:00