Title of Invention

"MULTI-PROTOCOL STORAGE APPLIANCE THAT PROVIDES INTEGRATED SUPPORT FOR FILE AND BLOCK ACCESS PROTOCOLS"

Abstract A multi-protocol storage appliance serves file and block protocol access to information stored on storage devices in an integrated manner for both network attached storage (NAS) and storage area network (SAN) deployments. A storage operating system of the appliance implements a file system (320) that cooperates with novel virtualization modules to provide a virtualization system (300) that "virtualizes" the storage space provided by the devices. The file system provides volume management capabilities for use in block-based access to the information stored on the devices. The virtualization system (300) allows the file system to logically organize the information as named file (324), directory (326) and virtual disk storage objects (322) to thereby provide an integrated NAS and SAN appliance approach to storage by enabling file-based access to the files and directories while further enabling block-based access to the virtualization
Full Text The present invention relates to storage systems and, in particular, to a multiprotocol storage appliance that supports file and block access protocols.
BACKGROUND OF THE INVENTION
A storage system is a computer that provides storage service relating to the organization of information on writable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each "on-disk" file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
The file server, or filer, may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the filer. Sharing of files is a hallmark of a NAS system, which is enabled because of its semantic level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow clients to remotely access the information (files) on the filer. The clients typically communicate with the filer by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
In the client/server model, the client may comprise an application executing on a computer that "connects" to the filer over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the file system over the network identifying one or more files to be accessed without regard to specific locations, e.g., blocks, in which the data are stored on disk. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the filer may be enhanced for networking clients.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the "extended bus". In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC or TCP/IP/Ethernet.
A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of information storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. In some SAN deployments, the information is organized in the form of databases, while in others a file-based organization is employed. Where the information is organized as files, the client requesting the information maintains file mappings and manages file semantics, while its requests (and server responses) address the information in terms of block addressing on disk using, e.g., a logical unit number (lun).
Previous approaches generally address the SAN and NAS environments using two separate solutions. For those approaches that provide a single solution for both environments, the NAS capabilities are typically "disposed" over the SAN storage system platform using, e.g., a "sidecar" device attached to the SAN platform. However, even these prior systems typically divide storage into distinct SAN and NAS storage domains. That is, the storage spaces for the SAN and NAS domains do not coexist and are physically partitioned by a configuration process implemented by, e.g., a user (system administrator).
An example of such a prior system is the Symmetrix© system platform available from EMC® Corporation. Broadly stated, individual disks of the SAN storage system (Symmetrix system) are allocated to a NAS sidecar device (e.g., Celerra™ device) that, in turn, exports those disks to NAS clients via, e.g., the NFS and CIFS protocols. A system administrator makes decisions as to the number of disks and the locations of "slices" (extents) of those disks that are aggregated to construct "user-defined volumes" and, thereafter, how those volumes are used. The term "volume" as conventionally used in a SAN environment implies a storage entity that is constructed by specifying physical disks and extents within those disks via operations that combine those extents/disks into a user-defined volume storage entity. Notably, the SAN-based disks and NAS-based disks comprising the user-defined volumes are physically partitioned within the system platform.
Typically, the system administrator renders its decisions through a complex user interface oriented towards users that are knowledgeable about the underlying physical aspects of the system. That is, the user interface revolves primarily around physical disk structures and management that a system administrator must manipulate in order to present a view of the SAN platform on behalf of a client. For example, the user interface may prompt the administrator to specify the physical disks, along with the sizes of extents within those disks, needed to construct the user-defined volume. In addition, the interface prompts the administrator for the physical locations of those extents and disks, as well as the manner in which they are "glued together" (organized) and made visible (exported) to a SAN client as a user-defined volume corresponding to a disk or lun. Once the physical disks and their extents are selected to construct a volume, only those disks/extents comprise that volume. The system administrator must also specify the form of reliability, e.g., a Redundant Array of Independent (or Inexpensive) Disks (RAID) protection level and/or mirroring, for that constructed volume. RAID groups are then overlaid on top of those selected disks/extents.
In sum, the prior system approach requires a system administrator to finely configure the physical layout of the disks and their organization to create a user-defined volume that is exported as a single lun to a SAN client. All of the administration associated with this prior approach is grounded on a physical disk basis. For the system administrator to increase the size of the user-defined volume, disks are added and RAID calculations are re-computed to include redundant information associated with data stored on the disks constituting the volume. Clearly, this is a complex and costly approach. The present invention is directed to providing a simple and efficient integrated solution to SAN and NAS storage environments.
SUMMARY OF THE INVENTION
The present invention relates to a multi-protocol storage appliance that serves file and block protocol access to information stored on storage devices in an integrated manner for both network attached storage (NAS) and storage area network (SAN) deployments. A storage operating system of the appliance implements a file system that cooperates with novel virtualization modules to provide a virtualization system that "virtualizes" the storage space provided by the devices. Notably, the file system provides volume management capabilities for use in block-based access to the information stored on the devices. The virtualization system allows the file system to logically organize the information as named file, directory and virtual disk (vdisk) storage objects to thereby provide an integrated NAS and SAN appliance approach to storage by enabling file-based access to the files and directories, while further enabling block-based access to the vdisks.
In the illustrative embodiment, the virtualization modules are embodied, e.g., as a vdisk module and a Small Computer Systems Interface (SCSI) target module. The vdisk module provides a data path from the block-based SCSI target module to blocks managed by the file system. The vdisk module also interacts with the file system to enable access by administrative interfaces, such as a streamlined user interface (UI), in response to a system administrator issuing commands to the multi-protocol storage appliance. In addition, the vdisk module manages SAN deployments by, among other things, implementing a comprehensive set of vdisk commands issued through the UI by a system administrator. These vdisk commands are converted to primitive file system operations that interact with the file system and the SCSI target module to implement the vdisks.
The SCSI target module, in turn, initiates emulation of a disk or logical unit number (lun) by providing a mapping procedure that translates logical block access to luns specified in access requests into virtual block access to vdisks and, for responses to the requests, vdisks into luns. The SCSI target module thus provides a translation layer of the virtualization system between a SAN block (lun) space and a file system space, where luns are represented as vdisks. By "disposing" SAN virtualization over the file system, the multi-protocol storage appliance reverses the approaches taken by prior systems to thereby provide a single unified storage platform for essentially all storage access protocols.
Advantageously, the integrated multi-protocol storage appliance provides access controls and, if appropriate, sharing of files and vdisks for all protocols, while preserving data integrity. The storage appliance further provides embedded/integrated virtualization capabilities that obviate the need for a user to apportion storage resources when creating NAS and SAN storage objects. These capabilities include a virtualized storage space that allows the SAN and NAS objects to coexist with respect to global space management within the appliance. Moreover, the integrated storage appliance provides simultaneous support for block access protocols to the same vdisk, as well as a heterogeneous SAN environment with support for clustering.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and further advantages of the invention may be better understood by referring to the following description in conjunction with the accompanying drawings in which like reference numerals indicate identical or functionally similar elements:
Fig. 1 is a schematic block diagram of a multi-protocol storage appliance configured to operate in storage area network (SAN) and network attached storage (NAS) environments in accordance with the present invention;
Fig. 2 is a schematic block diagram of a storage operating system of the multi-protocol storage appliance that may be advantageously used with the present invention;
Fig. 3 is a schematic block diagram of a virtualization system that is implemented by a file system interacting with virtualization modules according to the present invention; and
Fig. 4 is a flowchart illustrating the sequence of steps involved when accessing information stored on the multi-protocol storage appliance over a SAN network.
DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT
The present invention is directed to a multi-protocol storage appliance that serves both file and block protocol access to information stored on storage devices in an integrated manner. In this context, the integrated multi-protocol appliance denotes a computer having features such as simplicity of storage service management and ease of storage reconfiguration, including reusable storage space, for users (system administrators) and clients of network attached storage (NAS) and storage area network (SAN) deployments. The storage appliance may provide NAS services through a file system, while the same appliance provides SAN services through SAN virtualization, including logical unit number (lun) emulation.
Fig. 1 is a schematic block diagram of the multi-protocol storage appliance 100 configured to provide storage service relating to the organization of information on storage devices, such as disks 130. The storage appliance 100 is illustratively embodied as a storage system comprising a processor 122, a memory 124, a plurality of network adapters 125, 126 and a storage adapter 128 interconnected by a system bus 123. The multi-protocol storage appliance 100 also includes a storage operating system 200 that provides a virtualization system (and, in particular, a file system) to logically organize the information as a hierarchical structure of named directory, file and virtual disk (vdisk) storage objects on the disks 130.
Whereas clients of a NAS-based network environment have a storage viewpoint of files, the clients of a SAN-based network environment have a storage viewpoint of blocks or disks. To that end, the multi-protocol storage appliance 100 presents (exports) disks to SAN clients through the creation of luns or vdisk objects. A vdisk object (hereinafter "vdisk") is a special file type that is implemented by the virtualization system and translated into an emulated disk as viewed by the SAN clients. The multi-protocol storage appliance thereafter makes these emulated disks accessible to the SAN clients through controlled exports, as described further herein.
In the illustrative embodiment, the memory 124 comprises storage locations that are addressable by the processor and adapters for storing software program code and data structures associated with the present invention. The processor and adapters may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures. The storage operating system 200, portions of which are typically resident in memory and executed by the processing elements, functionally organizes the storage appliance by, inter alia, invoking storage operations in support of the storage service implemented by the appliance. It will be apparent to those skilled in the art that other processing and memory means, including various computer readable media, may be used for storing and executing program instructions pertaining to the inventive system and method described herein.
The network adapter 125 couples the storage appliance to a plurality of clients 160a,b over point-to-point links, wide area networks, virtual private networks implemented over a public network (Internet) or a shared local area network, hereinafter referred to as an illustrative Ethernet network 165. Therefore, the network adapter 125 may comprise a network interface card (NIC) having the mechanical, electrical and signaling circuitry needed to connect the appliance to a network switch, such as a conventional Ethernet switch 170. For this NAS-based network environment, the clients are configured to access information stored on the multi-protocol appliance as files. The clients 160 communicate with the storage appliance over network 165 by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
The clients 160 may be general-purpose computers configured to execute applications over a variety of operating systems, including the UNIX® and Microsoft® Windows™ operating systems. Client systems generally utilize file-based access protocols when accessing information (in the form of files and directories) over a NAS-based network. Therefore, each client 160 may request the services of the storage appliance 100 by issuing file access protocol messages (in the form of packets) to the appliance over the network 165. For example, a client 160a running the Windows operating system may communicate with the storage appliance 100 using the Common Internet File System (CIFS) protocol over TCP/IP. On the other hand, a client 160b running the UNIX operating system may communicate with the multi-protocol appliance using either the Network File System (NFS) protocol over TCP/IP or the Direct Access File System (DAFS) protocol over a virtual interface (VI) transport in accordance with a remote DMA (RDMA) protocol over TCP/IP. It will be apparent to those skilled in the art that other clients running other types of operating systems may also communicate with the integrated multi-protocol storage appliance using other file access protocols.
The storage network "target" adapter 126 also couples the multi-protocol storage appliance 100 to clients 160 that may be further configured to access the stored information as blocks or disks. For this SAN-based network environment, the storage appliance is coupled to an illustrative Fibre Channel (FC) network 185. FC is a networking standard describing a suite of protocols and media that is primarily found in SAN deployments. The network target adapter 126 may comprise a FC host bus adapter (HBA) having the mechanical, electrical and signaling circuitry needed to connect the appliance 100 to a SAN network switch, such as a conventional FC switch 180. In addition to providing FC access, the FC HBA may offload Fibre Channel network processing operations for the storage appliance.
The clients 160 generally utilize block-based access protocols, such as the Small Computer Systems Interface (SCSI) protocol, when accessing information (in the form of blocks, disks or vdisks) over a SAN-based network. SCSI is a peripheral input/output (I/O) interface with a standard, device independent protocol that allows different peripheral devices, such as disks 130, to attach to the storage appliance 100. In SCSI terminology, clients 160 operating in a SAN environment are initiators that initiate requests and commands for data. The multi-protocol storage appliance is thus a target configured to respond to the requests issued by the initiators in accordance with a request/response protocol. The initiators and targets have endpoint addresses that, in accordance with the FC protocol, comprise worldwide names (WWN). A WWN is a unique identifier, e.g., a node name or a port name, consisting of an 8-byte number.
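By way of illustration only, the 8-byte WWN described above can be handled as a 64-bit integer. The following minimal Python sketch assumes the colon-separated hexadecimal notation commonly used to display such names; the function names and the sample value are hypothetical and do not come from the specification.

```python
def format_wwn(value: int) -> str:
    """Render an 8-byte WWN as eight colon-separated hex octets (assumed notation)."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("a WWN must fit in 8 bytes")
    raw = value.to_bytes(8, byteorder="big")
    return ":".join(f"{octet:02x}" for octet in raw)


def parse_wwn(text: str) -> int:
    """Parse the colon-separated form back into the 8-byte number."""
    parts = text.split(":")
    if len(parts) != 8:
        raise ValueError("expected exactly 8 octets")
    return int.from_bytes(bytes(int(part, 16) for part in parts), byteorder="big")
```

A port name or a node name would be handled identically, since both are described as 8-byte numbers.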
The multi-protocol storage appliance 100 supports various SCSI-based protocols used in SAN deployments, including SCSI encapsulated over TCP (iSCSI) and SCSI encapsulated over FC (FCP). The initiators (hereinafter clients 160) may thus request the services of the target (hereinafter storage appliance 100) by issuing iSCSI and FCP messages over the network 165, 185 to access information stored on the disks. It will be apparent to those skilled in the art that the clients may also request the services of the integrated multi-protocol storage appliance using other block access protocols. By supporting a plurality of block access protocols, the multi-protocol storage appliance provides a unified and coherent access solution to vdisks/luns in a heterogeneous SAN environment.
The storage adapter 128 cooperates with the storage operating system 200 executing on the storage appliance to access information requested by the clients. The information may be stored on the disks 130 or other similar media adapted to store information. The storage adapter includes I/O interface circuitry that couples to the disks over an I/O interconnect arrangement, such as a conventional high-performance, FC serial link topology. The information is retrieved by the storage adapter and, if necessary, processed by the processor 122 (or the adapter 128 itself) prior to being forwarded over the system bus 123 to the network adapters 125, 126, where the information is formatted into packets or messages and returned to the clients.
Storage of information on the appliance 100 is preferably implemented as one or more storage volumes (e.g., VOL1-2 150) that comprise a cluster of physical storage disks 130, defining an overall logical arrangement of disk space. The disks within a volume are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability/integrity of data storage through the writing of data "stripes" across a given number of physical disks in the RAID group, and the appropriate storing of redundant information with respect to the striped data. The redundant information enables recovery of data lost when a storage device fails. It will be apparent to those skilled in the art that other redundancy techniques, such as mirroring, may be used in accordance with the present invention.
Specifically, each volume 150 is constructed from an array of physical disks 130 that are organized as RAID groups 140, 142, and 144. The physical disks of each RAID group include those disks configured to store striped data (D) and those configured to store parity (P) for the data, in accordance with an illustrative RAID 4 level configuration. It should be noted that other RAID level configurations (e.g., RAID 5) are also contemplated for use with the teachings described herein. In the illustrative embodiment, a minimum of one parity disk and one data disk may be employed. However, a typical implementation may include three data disks and one parity disk per RAID group and at least one RAID group per volume.
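The role of the parity disk in a RAID 4 group can be illustrated with a short sketch. RAID 4 parity is conventionally the bytewise XOR of the data blocks in a stripe, so any single lost block equals the XOR of the surviving blocks and the parity block. The Python below is a minimal sketch under that convention; the function names are hypothetical and the code is not the appliance's RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks in a stripe must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block (P) for one stripe of data blocks (D)."""
    return xor_blocks(list(data_blocks))


def rebuild_missing(surviving_data_blocks, parity_block):
    """Recover the single missing data block of a stripe."""
    return xor_blocks(list(surviving_data_blocks) + [parity_block])
```

With three data disks and one parity disk, as in the typical arrangement described above, each stripe would contribute three blocks to `compute_parity`.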
To facilitate access to the disks 130, the storage operating system 200 implements a write-anywhere file system of a novel virtualization system that "virtualizes" the storage space provided by disks 130. The file system logically organizes the information as a hierarchical structure of named directory and file objects (hereinafter "directories" and "files") on the disks. Each "on-disk" file may be implemented as a set of disk blocks configured to store information, such as data, whereas the directory may be implemented as a specially formatted file in which names and links to other files and directories are stored. The virtualization system allows the file system to further logically organize information as a hierarchical structure of named vdisks on the disks, thereby providing an integrated NAS and SAN appliance approach to storage by enabling file-based (NAS) access to the named files and directories, while further enabling block-based (SAN) access to the named vdisks on a file-based storage platform. The file system simplifies the complexity of management of the underlying physical storage in SAN deployments.
As noted, a vdisk is a special file type in a volume that derives from a plain (regular) file, but that has associated export controls and operation restrictions that support emulation of a disk. Unlike a file that can be created by a client using, e.g., the NFS or CIFS protocol, a vdisk is created on the multi-protocol storage appliance via, e.g., a user interface (UI) as a special typed file (object). Illustratively, the vdisk is a multi-inode object comprising a special file inode that holds data and at least one associated stream inode that holds attributes, including security information. The special file inode functions as a main container for storing data, such as application data, associated with the emulated disk. The stream inode stores attributes that allow luns and exports to persist over, e.g., reboot operations, while also enabling management of the vdisk as a single disk object in relation to SAN clients. An example of a vdisk and its associated inodes that may be advantageously used with the present invention is described in co-pending and commonly assigned U.S. Patent Application Serial No. (112056-0069) titled Storage Virtualization by Layering Vdisks on a File System, which application is hereby incorporated by reference as though fully set forth herein.
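The multi-inode structure of a vdisk may be pictured with a small in-memory model. The sketch below is illustrative only: the class names, the attribute keys and the sample path are hypothetical, and real inodes reference on-disk blocks rather than holding a byte array.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SpecialFileInode:
    """Main container for the data of the emulated disk."""
    data: bytearray


@dataclass
class StreamInode:
    """Holds attributes (e.g., export and security information)."""
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class Vdisk:
    """A vdisk: one special file inode plus at least one stream inode."""
    name: str
    file_inode: SpecialFileInode
    stream_inodes: List[StreamInode]

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.file_inode.data[offset:offset + length])

    def write(self, offset: int, payload: bytes) -> None:
        if offset + len(payload) > len(self.file_inode.data):
            raise ValueError("write past the end of the emulated disk")
        self.file_inode.data[offset:offset + len(payload)] = payload


def create_vdisk(name: str, size_bytes: int, export_attrs: Dict[str, object]) -> Vdisk:
    """Build a zero-filled vdisk whose attributes live in a stream inode."""
    return Vdisk(name, SpecialFileInode(bytearray(size_bytes)),
                 [StreamInode(dict(export_attrs))])
```

Keeping the attributes in a separate stream inode mirrors the description above, in which export information persists independently of the data held by the special file inode.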
In the illustrative embodiment, the storage operating system is preferably the NetApp® Data ONTAP™ operating system available from Network Appliance, Inc., Sunnyvale, California that implements a Write Anywhere File Layout (WAFL™) file system. However, it is expressly contemplated that any appropriate storage operating system, including a write in-place file system, may be enhanced for use in accordance with the inventive principles described herein. As such, where the term "WAFL" is employed, it should be taken broadly to refer to any storage operating system that is otherwise adaptable to the teachings of this invention.
As used herein, the term "storage operating system" generally refers to the computer-executable code operable on a computer that manages data access and may, in the case of a multi-protocol storage appliance, implement data access semantics, such as the Data ONTAP storage operating system, which is implemented as a microkernel. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
In addition, it will be understood to those skilled in the art that the inventive system and method described herein may apply to any type of special-purpose (e.g., storage serving appliance) or general-purpose computer, including a standalone computer or portion thereof, embodied as or including a storage system. Moreover, the teachings of this invention can be adapted to a variety of storage system architectures including, but not limited to, a network-attached storage environment, a storage area network and disk assembly directly-attached to a client or host computer. The term "storage system" should therefore be taken broadly to include such arrangements in addition to any subsystems configured to perform a storage function and associated with other equipment or systems.
Fig. 2 is a schematic block diagram of the storage operating system 200 that may be advantageously used with the present invention. The storage operating system comprises a series of software layers organized to form an integrated network protocol stack or, more generally, a multi-protocol engine that provides data paths for clients to access information stored on the multi-protocol storage appliance using block and file access protocols. The protocol stack includes a media access layer 210 of network drivers (e.g., gigabit Ethernet drivers) that interfaces to network protocol layers, such as the IP layer 212 and its supporting transport mechanisms, the TCP layer 214 and the User Datagram Protocol (UDP) layer 216. A file system protocol layer provides multi-protocol file access and, to that end, includes support for the DAFS protocol 218, the NFS protocol 220, the CIFS protocol 222 and the Hypertext Transfer Protocol (HTTP) protocol 224. A VI layer 226 implements the VI architecture to provide direct access transport (DAT) capabilities, such as RDMA, as required by the DAFS protocol 218.
An iSCSI driver layer 228 provides block protocol access over the TCP/IP network protocol layers, while a FC driver layer 230 operates with the FC HBA 126 to receive and transmit block access requests and responses to and from the integrated storage appliance. The FC and iSCSI drivers provide FC-specific and iSCSI-specific access control to the luns (vdisks) and, thus, manage exports of vdisks to either iSCSI or FCP or, alternatively, to both iSCSI and FCP when accessing a single vdisk on the multi-protocol storage appliance. In addition, the storage operating system includes a disk storage layer 240 that implements a disk storage protocol, such as a RAID protocol, and a disk driver layer 250 that implements a disk access protocol such as, e.g., a SCSI protocol.
Bridging the disk software layers with the integrated network protocol stack layers is a virtualization system 300 according to the present invention. Fig. 3 is a schematic block diagram of the virtualization system 300 that is implemented by a file system 320 cooperating with virtualization modules illustratively embodied as, e.g., vdisk module 330 and SCSI target module 310. It should be noted that the vdisk module 330, file system 320 and SCSI target module 310 can be implemented in software, hardware, firmware, or a combination thereof. The vdisk module 330 is layered on (and interacts with) the file system 320 to provide a data path from the block-based SCSI target module to blocks managed by the file system. The vdisk module also enables access by administrative interfaces, such as a streamlined user interface (UI 350), in response to a system administrator issuing commands to the multi-protocol storage appliance 100. In essence, the vdisk module 330 manages SAN deployments by, among other things, implementing a comprehensive set of vdisk (lun) commands issued through the UI 350 by a system administrator. These vdisk commands are converted to primitive file system operations ("primitives") that interact with the file system 320 and the SCSI target module 310 to implement the vdisks.
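The conversion of a vdisk command into primitives can be sketched as a simple expansion table. The specification does not enumerate the commands or the primitives, so every command name and primitive name in the Python below is a hypothetical placeholder chosen only to show the shape of such a conversion.

```python
def expand_vdisk_command(command: str, path: str, size_bytes: int = 0):
    """Translate one (hypothetical) vdisk command into an ordered list of
    (hypothetical) primitive file system operations."""
    if command == "create":
        return [
            ("create_file", path),          # allocate the special file inode
            ("set_type", path, "vdisk"),    # mark the file as a vdisk
            ("set_size", path, size_bytes), # size of the emulated disk
        ]
    if command == "resize":
        return [("set_size", path, size_bytes)]
    if command == "destroy":
        return [("unexport", path), ("remove_file", path)]
    raise ValueError(f"unknown vdisk command: {command}")
```

Under this sketch, an administrator works only with the logical command, and the ordering of the underlying operations is fixed by the vdisk module rather than by the administrator.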
The SCSI target module 310, in turn, initiates emulation of a disk or lun by providing a mapping procedure that translates logical block access to luns specified in access requests into virtual block access to the special vdisk file types and, for responses to the requests, vdisks into luns. The SCSI target module is illustratively disposed between the FC and iSCSI drivers 228, 230 and the file system 320 to thereby provide a translation layer of the virtualization system 300 between the SAN block (lun) space and the file system space, where luns are represented as vdisks. By "disposing" SAN virtualization over the file system 320, the multi-protocol storage appliance reverses the approaches taken by prior systems to thereby provide a single unified storage platform for essentially all storage access protocols.
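The mapping procedure described above may be sketched as a two-way table between lun identifiers and vdisk files. The Python below is a minimal sketch, not the actual SCSI target module: the class and method names and the sample path are hypothetical, and the 512-byte logical block size is an assumption made for the example.

```python
class ScsiTargetMap:
    """Hypothetical translation layer between lun space and file system space."""

    def __init__(self, block_size: int = 512):
        self.block_size = block_size  # assumed logical block size
        self._lun_to_vdisk = {}
        self._vdisk_to_lun = {}

    def export(self, lun_id: int, vdisk_path: str) -> None:
        """Record that a vdisk is presented to clients as the given lun."""
        self._lun_to_vdisk[lun_id] = vdisk_path
        self._vdisk_to_lun[vdisk_path] = lun_id

    def to_vdisk(self, lun_id: int, lba: int, block_count: int):
        """Request path: (lun, logical block range) -> (vdisk, byte offset, byte length)."""
        path = self._lun_to_vdisk[lun_id]
        return path, lba * self.block_size, block_count * self.block_size

    def to_lun(self, vdisk_path: str) -> int:
        """Response path: vdisk -> lun identifier seen by the client."""
        return self._vdisk_to_lun[vdisk_path]
```

For example, a request for 16 blocks starting at logical block 8 of lun 0 would, under these assumptions, become a read of 8192 bytes at byte offset 4096 of the corresponding vdisk file.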
According to the invention, the file system provides capabilities for use in file-based access to information stored on the storage devices, such as disks. In addition, the file system provides volume management capabilities for use in block-based access to the stored information. That is, in addition to providing file system semantics (such as differentiation of storage into discrete objects and naming of those storage objects), the file system 320 provides functions normally associated with a volume manager. As described herein, these functions include (i) aggregation of the disks, (ii) aggregation of storage bandwidth of the disks, and (iii) reliability guarantees, such as mirroring and/or parity (RAID), to thereby present one or more storage objects layered on the file system. A feature of the multi-protocol storage appliance is the simplicity of use associated with these volume management capabilities, particularly when used in SAN deployments.
The file system 320 illustratively implements the WAFL file system having an on-disk format representation that is block-based using, e.g., 4 kilobyte (kB) blocks and using inodes to describe the files. The WAFL file system uses files to store meta-data describing the layout of its file system; these meta-data files include, among others, an inode file. A file handle, i.e., an identifier that includes an inode number, is used to retrieve an inode from disk. A description of the structure of the file system, including the inode file, is provided in U.S. Patent No. 5,819,292, titled Method for Maintaining Consistent States of a File System and for Creating User-Accessible Read-Only Copies of a File System by David Hitz et al., issued October 6, 1998, which patent is hereby incorporated by reference as though fully set forth herein.
Broadly stated, all inodes of the file system are organized into the inode file. A file system (FS) info block specifies the layout of information in the file system and includes an inode of a file that includes all other inodes of the file system. Each volume has an FS info block that is preferably stored at a fixed location within, e.g., a RAID group of the file system. The inode of the root FS info block may directly reference (point to) blocks of the inode file or may reference indirect blocks of the inode file that, in turn, reference direct blocks of the inode file. Within each direct block of the inode file are embedded inodes, each of which may reference indirect blocks that, in turn, reference data blocks of a file or vdisk.
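Because inodes are embedded in the blocks of the inode file, finding one reduces to simple arithmetic on the inode number. The following is a minimal sketch of that arithmetic; the 4 kB block size follows the illustrative format above, while the 128-byte inode size and the function name `locate_inode` are assumptions made only for this example.

```python
BLOCK_SIZE = 4096   # 4 kB blocks, as in the illustrative on-disk format
INODE_SIZE = 128    # assumed inode size in bytes (hypothetical)
INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE


def locate_inode(inode_number):
    """Return (inode-file block index, byte offset within that block)."""
    if inode_number < 0:
        raise ValueError("inode number must be non-negative")
    block_index, slot = divmod(inode_number, INODES_PER_BLOCK)
    return block_index, slot * INODE_SIZE
```

Under these assumed sizes each inode-file block holds 32 inodes, so inode 33 would sit in the second block of the inode file at byte offset 128.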
According to an aspect of the invention, the file system implements access operations to vdisks 322, as well as to files 324 and directories (dir 326) that coexist with respect to global space management of units of storage, such as volumes 150 and/or qtrees 328. A qtree 328 is a special directory that has the properties of a logical sub-volume within the namespace of a physical volume. Each file system storage object (file, directory or vdisk) is illustratively associated with one qtree, and quotas, security properties and other items can be assigned on a per-qtree basis. The vdisks and files/directories may be layered on top of qtrees 328 that, in turn, are layered on top of volumes 150 as abstracted by the file system "virtualization" layer 320.
Note that the vdisk storage objects in the file system 320 are associated with SAN deployments of the multi-protocol storage appliance, whereas the file and directory storage objects are associated with NAS deployments of the appliance. The files and directories are generally not accessible via the FC or SCSI block access protocols; however, a file can be converted to a vdisk and then accessed by either the SAN or NAS protocol. The vdisks are accessible as luns from the SAN (FC and SCSI) protocols and as files by the NAS (NFS and CIFS) protocols.
In another aspect of the invention, the virtualization system 300 provides a virtualized storage space that allows SAN and NAS storage objects to coexist with respect to global space management by the file system 320. To that end, the virtualization system 300 exploits the characteristics of the file system, including its inherent ability to aggregate disks and abstract them into a single pool of storage. For example, the system 300 leverages the volume management capability of the file system 320 to organize a collection of disks 130 into one or more volumes 150 representing a pool of global storage space. The pool of global storage is then made available for both SAN and NAS deployments through the creation of vdisks 322 and files 324, respectively. In addition to sharing the same global storage space, the vdisks and files share the same pool of available storage from which to draw on when expanding the SAN and/or NAS deployments. Unlike prior systems, there is no physical partitioning of disks within the global storage space of the multi-protocol storage appliance.
The multi-protocol storage appliance substantially simplifies management of the global storage space by allowing a user to manage both NAS and SAN storage objects using the single pool of storage resources. In particular, free block space is managed from a global free pool on a fine-grained block basis for both SAN and NAS deployments. If those storage objects were managed discretely (separately), the user would be required to keep a certain amount of "spare" disks on hand for each type of object to respond to changes in, e.g., business objectives. The overhead required to maintain that discrete approach is greater than if those objects could be managed out of a single pool of resources with only a single group of spared disks available for expansion as business dictates. Blocks released individually by vdisk operations are immediately reusable by NAS objects (and vice versa). The details of such management are transparent to the administrator. This represents a "total cost of ownership" advantage of the integrated multi-protocol storage appliance.
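The shared free pool can be illustrated with a small sketch in which a vdisk and a file draw blocks from, and return blocks to, the same pool. The class `GlobalBlockPool`, its methods and the object names are hypothetical and serve only to show block-granular reuse across SAN and NAS objects.

```python
class GlobalBlockPool:
    """Hypothetical single pool of free blocks shared by all storage objects."""

    def __init__(self, total_blocks):
        self._free = set(range(total_blocks))
        self._owned = {}  # object name -> set of block numbers it holds

    def allocate(self, owner, count):
        if count > len(self._free):
            raise RuntimeError("global pool exhausted")
        blocks = {self._free.pop() for _ in range(count)}
        self._owned.setdefault(owner, set()).update(blocks)
        return blocks

    def release(self, owner, blocks):
        blocks = set(blocks)
        if not blocks <= self._owned.get(owner, set()):
            raise ValueError("owner does not hold these blocks")
        self._owned[owner] -= blocks
        self._free |= blocks

    @property
    def free_count(self):
        return len(self._free)


pool = GlobalBlockPool(total_blocks=8)
vdisk_blocks = pool.allocate("vdisk:/vol/vol0/lun0", 6)
# The vdisk shrinks; four of its blocks go straight back to the shared pool.
pool.release("vdisk:/vol/vol0/lun0", list(vdisk_blocks)[:4])
# A NAS file can now grow into space that the vdisk just gave up.
file_blocks = pool.allocate("file:/vol/vol0/report.txt", 5)
```

With separately partitioned pools, the second allocation would fail even though the space exists; with one pool it succeeds without administrator involvement.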
The virtualization system 300 further provides reliability guarantees for those SAN and NAS storage objects coexisting in the global storage space of the multi-protocol appliance 100. In particular, reliability guarantees in the face of disk failures, which conventional SAN systems provide through techniques such as RAID or mirroring performed at a physical block level, are a feature inherited from the file system 320 of the appliance 100. This simplifies administration by allowing an administrator to make global decisions on the underlying redundant physical storage that apply equally to vdisks and NAS objects in the file system.
As noted, the file system 320 organizes information as file, directory and vdisk objects within volumes 150 of disks 130. Underlying each volume 150 is a collection of RAID groups 140-144 that provide protection and reliability against disk failure(s) within the volume. The information serviced by the multi-protocol storage appliance is protected according to an illustrative RAID 4 configuration. This level of protection may be extended to include, e.g., synchronous mirroring on the appliance platform. A vdisk 322 created on a volume that is protected by RAID 4 "inherits" the added protection of synchronous mirroring if that latter protection is specified for the volume 150. In this case, the synchronous mirroring protection is not a property of the vdisk but rather a property of the underlying volume and the reliability guarantees of the file system 320. This "inheritance" feature of the multi-protocol storage appliance simplifies management of a vdisk because a system administrator does not have to deal with reliability issues.
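As general background on the RAID 4 protection mentioned above, parity on a dedicated disk is the byte-wise XOR of the corresponding data blocks, which lets any single lost block be rebuilt from the survivors. The sketch below shows only that general property; the helper `xor_blocks` and the tiny 4-byte blocks are illustrative and say nothing about how the appliance's RAID layer is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)  # what a dedicated parity disk would hold

# Simulate losing data disk 1 and rebuilding it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Since the rebuild operates on blocks of the volume, a vdisk stored in those blocks is protected without carrying any reliability setting of its own.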
In addition, the virtualization system 300 aggregates bandwidth of the disks 130 without requiring user knowledge of the physical construction of those disks. The file system 320 is configured to write (store) data on the disks as long, continuous stripes across those disks in accordance with input/output (I/O) storage operations that aggregate the bandwidth of all the disks of a volume for stored data. When information is stored or retrieved from the vdisks, the I/O operations are not directed to disks specified by a user. Rather, those operations are transparent to the user because the file system "stripes" that data across all the disks of the volume in a reliable manner according to its write anywhere layout policy. As a result of virtualization of block storage, I/O bandwidth to a vdisk can be the maximum bandwidth of the underlying physical disks of the file system, regardless of the size of the vdisk (unlike typical physical implementations of luns in conventional block access products).
Moreover, the virtualization system leverages file system placement, management and block allocation policies to make the vdisks function correctly within the multi-protocol storage appliance. The vdisk block placement policies are a function of the underlying virtualizing file system and there are no permanent physical bindings of file system blocks to SCSI logical block addresses in the face of modifications. The vdisks may be transparently reorganized to perhaps alter data access pattern behaviour.
For both SAN and NAS deployments, the block allocation policies are independent of physical properties of the disks (e.g., geometries, sizes, cylinders, sector size). The file system provides file-based management of the files 324 and directories 326 and, in accordance with the invention, vdisks 322 residing within the volumes 150. When a disk is added to the array attached to the multi-protocol storage appliance, that disk is integrated into an existing volume to increase the entire volume space, which space may be used for any purpose, e.g., more vdisks or more files.
Management of the integrated multi-protocol storage appliance 100 is simplified through the use of the UI 350 and the vdisk command set available to the system administrator. The UI 350 illustratively comprises both a command line interface (CLI 352) and a graphical user interface (GUI 354) used to implement the vdisk command set to, among other things, create a vdisk, increase/decrease the size of a vdisk and/or destroy a vdisk. The storage space for the destroyed vdisk may then be reused for, e.g., a NAS-based file in accordance with the virtualized storage space feature of the appliance 100. A vdisk may increase ("grow") or decrease ("shrink") under user control while preserving block and NAS multi-protocol access to its application data.
The UI 350 simplifies management of the multi-protocol SAN/NAS storage appliance by, e.g., obviating the need for a system administrator to explicitly configure and specify the disks to be used when creating a vdisk. For instance, to create a vdisk, the system administrator need merely issue a vdisk ("lun create") command through, e.g., the CLI 352 or GUI 354. The vdisk command specifies creation of a vdisk (lun), along with the desired size of the vdisk and a path descriptor (pathname) to that vdisk. In response, the file system 320 cooperates with the vdisk module 330 to "virtualize" the storage space provided by the underlying disks and create a vdisk as specified by the create command. Specifically, the vdisk module 330 processes the vdisk command to "call" primitive operations ("primitives") in the file system 320 that implement high-level notions of vdisks (luns). For example, the "lun create" command is translated into a series of file system primitives that create a vdisk with correct information and size, as well as at the correct location. These file system primitives include operations to create a file inode (create_file), create a stream inode (create_stream), and store information in the stream inode (stream_write).
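The translation of a "lun create" command into primitives can be sketched as follows. The three primitive names mirror those given in the text, but their signatures, the in-memory `ToyFileSystem` class, the "attributes" stream name and the stored fields are all assumptions made for illustration.

```python
class ToyFileSystem:
    """Hypothetical in-memory stand-in for the file system primitives."""

    def __init__(self):
        self.inodes = {}  # pathname -> file inode (modeled as a dict)
        self.log = []     # order in which primitives were called

    def create_file(self, path, size):
        self.log.append("create_file")
        self.inodes[path] = {"size": size, "streams": {}}

    def create_stream(self, path, name):
        self.log.append("create_stream")
        self.inodes[path]["streams"][name] = {}

    def stream_write(self, path, name, attrs):
        self.log.append("stream_write")
        self.inodes[path]["streams"][name].update(attrs)


def lun_create(fs, path, size_bytes):
    """Translate a 'lun create' request into file system primitives."""
    if size_bytes <= 0:
        raise ValueError("vdisk size must be positive")
    fs.create_file(path, size_bytes)
    fs.create_stream(path, "attributes")
    fs.stream_write(path, "attributes", {"type": "vdisk", "size": size_bytes})


fs = ToyFileSystem()
lun_create(fs, "/vol/vol0/lun0", 10 * 1024**3)  # hypothetical 10 GiB vdisk
```

Note that the caller supplies only a pathname and a size; no disk, RAID group or partition is named anywhere in the request.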
The result of the lun create command is the creation of a vdisk 322 having the specified size and that is RAID protected without having to explicitly specify such protection. Storage of information on disks of the multi-protocol storage appliance is not typed; only "raw" bits are stored on the disks. The file system organizes those bits into vdisks and RAID groups across all of the disks within a volume. Thus, the created vdisk 322 does not have to be explicitly configured because the virtualization system 300 creates a vdisk in a manner that is transparent to the user. The created vdisk inherits high-performance characteristics, such as reliability and storage bandwidth, of the underlying volume created by the file system.
The CLI 352 and/or GUI 354 also interact with the vdisk module 330 to introduce attributes and persistent lun map bindings that assign numbers to the created vdisk. These lun map bindings are thereafter used to export vdisks as certain SCSI identifiers (IDs) to the clients. In particular, the created vdisk can be exported via a lun mapping technique to enable a SAN client to "view" (access) a disk. Vdisks (luns) generally require strict controlled access in a SAN environment; sharing of luns in a SAN environment typically occurs only in limited circumstances, such as clustered file systems, clustered operating systems and multi-pathing configurations. A system administrator of the multi-protocol storage appliance determines which vdisks (luns) can be exported to a SAN client. Once a vdisk is exported as a lun, the client may access the vdisk over the SAN network utilizing a block access protocol, such as FCP and iSCSI.
SAN clients typically identify and address disks by logical numbers or luns. However, an "ease of management" feature of the multi-protocol storage appliance is that system administrators can manage vdisks and their addressing by logical names. To that end, the vdisk module 330 of the multi-protocol storage appliance maps logical names to vdisks. For example, when creating a vdisk, the system administrator "right size" allocates the vdisk and assigns it a name that is generally meaningful to its intended application (e.g., /vol/vol0/database to hold a database). The administrative interface provides name-based management of luns/vdisks (as well as files) exported from the storage appliance on the clients, thereby providing a uniform and unified naming scheme for block-based (as well as file-based) storage.
The multi-protocol storage appliance manages export control of vdisks by logical names through the use of initiator groups (igroups). An igroup is a logical named entity that is assigned to one or more addresses associated with one or more initiators (depending upon whether a clustered environment is configured). An "igroup create" command essentially "binds" (associates) those addresses, which may comprise WWN addresses or iSCSI IDs, to a logical name or igroup. A "lun map" command is then used to export one or more vdisks to the igroup, i.e., make the vdisk(s) "visible" to the igroup. In this sense, the "lun map" command is equivalent to an NFS export or a CIFS share. The WWN addresses or iSCSI IDs thus identify the clients that are allowed to access those vdisks specified by the lun map command. Thereafter, the logical name is used with all operations internal to the storage operating system. This logical naming abstraction is pervasive throughout the entire vdisk command set, including interactions between a user and the multi-protocol storage appliance. In particular, the igroup naming convention is used for all subsequent export operations and listings of luns that are exported for various SAN clients.
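The relationship between igroups and lun map bindings can be sketched with a small bookkeeping class. `ExportTable`, its method names and the example initiator identifier are hypothetical; the sketch models the behaviour described above and is not the appliance's actual command implementation.

```python
class ExportTable:
    """Hypothetical bookkeeping for igroups and lun map bindings."""

    def __init__(self):
        self.igroups = {}  # igroup name -> set of initiator addresses
        self.maps = {}     # igroup name -> {lun id: vdisk pathname}

    def igroup_create(self, name, initiators):
        self.igroups[name] = set(initiators)
        self.maps[name] = {}

    def lun_map(self, path, igroup, lun_id):
        if igroup not in self.igroups:
            raise KeyError(f"unknown igroup: {igroup}")
        if lun_id in self.maps[igroup]:
            raise ValueError(f"lun id {lun_id} already mapped for {igroup}")
        self.maps[igroup][lun_id] = path

    def visible_luns(self, initiator):
        """Return {lun id: pathname} for every igroup containing the initiator."""
        visible = {}
        for name, members in self.igroups.items():
            if initiator in members:
                visible.update(self.maps[name])
        return visible


table = ExportTable()
table.igroup_create("db_hosts", ["iqn.2003-07.com.example:host1"])
table.lun_map("/vol/vol0/lun0", "db_hosts", 0)
```

An initiator that belongs to no igroup sees nothing, which corresponds to the strict access control that luns generally require in a SAN environment.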
Fig. 4 is a schematic flow chart illustrating the sequence of steps involved when accessing information stored on the multi-protocol storage appliance over a SAN network. Here, a client communicates with the storage appliance 100 using a block access protocol over a network coupled to the appliance. If the client is client 160a running the Windows operating system, the block access protocol is illustratively the FCP protocol used over the network 185. On the other hand, if the client is client 160b running the UNIX operating system, the block access protocol is illustratively the iSCSI protocol used over network 165. The sequence starts at Step 400 and proceeds to Step 402 where the client generates a request to access information residing on the multi-protocol storage appliance and, in Step 404, the request is forwarded as a conventional FCP or iSCSI block access request over the network 185, 165.
At Step 406, the request is received at network adapter 126, 125 of the storage appliance 100, where it is processed by the integrated network protocol stack and passed to the virtualization system 300 at Step 408. Specifically, if the request is a FCP request, it is processed as, e.g., a 4k block request to access (i.e., read/write) data by the FC driver 230. If the request is an iSCSI protocol request, it is received at the media access layer (the Intel gigabit Ethernet) and passed through the TCP/IP network protocol layers to the virtualization system.
Command and control operations, including addressing information, associated with the SCSI protocol are generally directed to disks or luns; however, the file system 320 does not recognize luns. As a result, the SCSI target module 310 of the virtualization system initiates emulation of a lun in order to respond to the SCSI commands contained in the request (Step 410). To that end, the SCSI target module has a set of application programming interfaces (APIs 360) that are based on the SCSI protocol and that enable a consistent interface to both the iSCSI and FCP drivers 228, 230. The SCSI target module further implements a mapping/translation procedure that essentially translates a lun into a vdisk. At Step 412, the SCSI target module maps the addressing information, e.g., FC routing information, of the request to the internal structure of the file system.
The file system 320 is illustratively a message-based system; as such, the SCSI target module 310 transposes the SCSI request into a message representing an operation directed to the file system. For example, the message generated by the SCSI target module may include a type of operation (e.g., read, write) along with a pathname (e.g., a path descriptor) and a filename (e.g., a special filename) of the vdisk object represented in the file system. The SCSI target module 310 passes the message into the file system layer 320 as, e.g., a function call 365, where the operation is performed.
In response to receiving the message, the file system 320 maps the pathname to inode structures to obtain the file handle corresponding to the vdisk 322. Armed with a file handle, the storage operating system 200 can convert that handle to a disk block and, thus, retrieve the block (inode) from disk. Broadly stated, the file handle is an internal representation of the data structure, i.e., a representation of the inode data structure that is used internally within the file system. The file handle generally consists of a plurality of components including a file ID (inode number), a snapshot ID, a generation ID and a flag. The file system utilizes the file handle to retrieve the special file inode and at least one associated stream inode that comprise the vdisk within the file system structure implemented on the disks 130.
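The file handle components listed above can be modeled as a small record. In the sketch below the four field names follow the text, but their types, the dictionary-based inode table and the stale-generation check in `resolve` are assumptions added for illustration; the specification does not state how the generation ID is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    """Components listed in the text; field types are assumptions."""
    file_id: int        # inode number
    snapshot_id: int
    generation_id: int
    flags: int


def resolve(handle, inode_table):
    """Look up an inode, rejecting handles whose generation looks stale (assumed check)."""
    inode = inode_table[handle.file_id]
    if inode["generation"] != handle.generation_id:
        raise LookupError("stale file handle")
    return inode


# Hypothetical inode table holding one vdisk inode.
inode_table = {96: {"generation": 3, "type": "vdisk"}}
inode = resolve(FileHandle(file_id=96, snapshot_id=0, generation_id=3, flags=0), inode_table)
```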
In Step 414, the file system generates operations to load (retrieve) the requested data from disk 130 if it is not resident "in core", i.e., in the memory 124. If the information is not in memory, the file system 320 indexes into the inode file using the inode number to access an appropriate entry and retrieve a logical volume block number (VBN). The file system then passes the logical VBN to the disk storage (RAID) layer 240, which maps that logical number to a disk block number and sends the latter to an appropriate driver (e.g., SCSI) of the disk driver layer 250. The disk driver accesses the disk block number from disk 130 and loads the requested data block(s) in memory 124. In Step 416, the requested data is processed by the virtualization system 300. For example, the data may be processed in connection with a read or write operation directed to a vdisk or in connection with a query command for the vdisk.
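The hand-off from a logical VBN to a disk block number can be sketched with a deliberately simple layout. The concatenated layout used here (each data disk contributing a contiguous run of VBNs) and the function `vbn_to_disk_block` are assumptions chosen for brevity; the actual mapping performed by the RAID layer 240 is not specified in this passage.

```python
def vbn_to_disk_block(vbn, data_disks, blocks_per_disk):
    """Map a logical volume block number to (disk name, disk block number).

    Assumes a concatenated layout: disk 0 holds VBNs 0..N-1, disk 1 the next N, etc.
    """
    disk_index, dbn = divmod(vbn, blocks_per_disk)
    if vbn < 0 or disk_index >= len(data_disks):
        raise ValueError("VBN outside the volume")
    return data_disks[disk_index], dbn
```

For example, with three hypothetical data disks of 1000 blocks each, VBN 2500 would resolve to block 500 on the third disk. The file system never needs to know this; it deals only in VBNs.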
The SCSI target module 310 of the virtualization system 300 emulates support for the conventional SCSI protocol by providing meaningful "simulated" information about a requested vdisk. Such information is either calculated by the SCSI target module or stored persistently in, e.g., the attributes stream inode of the vdisk. At Step 418, the SCSI target module 310 loads the requested block-based information (as translated from file-based information provided by the file system 320) into a block access (SCSI) protocol message. For example, the SCSI target module 310 may load information, such as the size of a vdisk, into a SCSI protocol message in response to a SCSI query command request. Upon completion of the request, the storage appliance (and operating system) returns a reply (e.g., as a SCSI "capacity" response message) to the client over the network (Step 420). The sequence then ends at Step 422.
It should be noted that the software "path" through the storage operating system layers described above needed to perform data storage access for the client request received at the multi-protocol storage appliance may alternatively be implemented in hardware. That is, in an alternate embodiment of the invention, a storage access request data path through the operating system layers (including the virtualization system 300) may be implemented as logic circuitry embodied within a field programmable gate array (FPGA) or an application specific integrated circuit (ASIC). This type of hardware implementation increases the performance of the storage service provided by appliance 100 in response to a file access or block access request issued by a client 160. Moreover, in another alternate embodiment of the invention, the processing elements of network and storage adapters 125-128 may be configured to offload some or all of the packet processing and storage access operations, respectively, from processor 122 to thereby increase the performance of the storage service provided by the multi-protocol storage appliance. It is expressly contemplated that the various processes, architectures and procedures described herein can be implemented in hardware, firmware or software.
Advantageously, the integrated multi-protocol storage appliance provides access controls and, if appropriate, sharing of files and vdisks for all protocols, while preserving data integrity. The storage appliance further provides embedded/integrated virtualization capabilities that obviate the need for a user to apportion storage resources when creating NAS and SAN storage objects. These capabilities include a virtualized storage space that allows SAN and NAS storage objects to coexist with respect to global space management within the appliance. Moreover, the integrated storage appliance provides simultaneous support for block access protocols (iSCSI and FCP) to the same vdisk, as well as a heterogeneous SAN environment with support for clustering. In sum, the multi-protocol storage appliance provides a single unified storage platform for all storage access protocols.
The foregoing description has been directed to specific embodiments of this invention. It will be apparent, however, that other variations and modifications may be made to the described embodiments, with the attainment of some or all of their advantages. For example, it is expressly contemplated that the teachings of this invention can be implemented as software, including a computer-readable medium having program instructions executing on a computer, hardware, firmware, or a combination thereof. Accordingly, this description is to be taken only by way of example and not to otherwise limit the scope of the invention. It is thus the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.
We Claim:
1. A multi protocol storage appliance [100] to provide file protocol and block protocol access to information stored on storage devices [130] for both network attached storage (NAS) and storage area network (SAN) deployments, the appliance comprising:
a storage operating system [200] to implement a file system [320] cooperating with a virtualization system [300] to virtualize a storage space, the virtual storage space, provided by the storage devices [130], where the virtualization system [300] includes a virtual disk (vdisk) module [330] and a storage network adapter [126];
the storage network adapter [126] provides a mapping procedure that translates logical block access requests into access requests for the virtual storage space; and
the vdisk module [330] to provide a data path from the storage network adapter [126] to blocks managed by the file system [320].
2. The multi protocol storage appliance [100] as claimed in claim 1, wherein the file system [320] logically organizes the information as files [324], directories [326] and virtual disks [322] (vdisks) to thereby provide an integrated NAS and SAN appliance approach to storage by enabling file-based access to the files and directories, while further enabling block-based access to the vdisks, each of the vdisks being a special file type that is translated into an emulated disk.
3. The multi protocol storage appliance [100] as claimed in claim 1, wherein the storage network adapter comprises Small Computer Systems Interface (SCSI) target module [310].
4. The multi protocol storage appliance [100] as claimed in claim 3, wherein the vdisk module [330] is layered on the file system [320].
5. The multi protocol storage appliance [100] as claimed in claim 1, wherein the vdisk module [330] manages the SAN deployments by implementing a set of vdisk commands.
6. The multi protocol storage appliance [100] as claimed in claim 5, wherein the vdisk commands are converted to primitive file system operations that interact with the file system [320] and the SCSI target module [310] to implement the vdisks [322].
7. The multi protocol storage appliance [100] as claimed in claim 6, wherein the SCSI target module [310] initiates emulation of a disk or logical unit number (lun) by providing a mapping procedure that translates the lun into a vdisk [322].
8. The multi protocol storage appliance [100] as claimed in claim 7, wherein the SCSI target module [310] provides a translation layer between a SAN block space and a file system space.
9. The multi protocol storage appliance [100] as claimed in claim 1, wherein the virtualized storage space allows SAN and NAS storage objects to coexist with respect to global space management by the file system [320].
10. The multi protocol storage appliance [100] as claimed in claim 9, wherein the file system [320] cooperates with the virtualization system [300] to provide reliability guarantees for the SAN and NAS storage objects coexisting in the virtualized storage space.
11. The multi protocol storage appliance [100] as claimed in claim 1, wherein the file system [320] provides volume management capabilities for use in block-based access to the information stored on the storage devices [130].
12. The multi protocol storage appliance [100] as claimed in claim 11, wherein the storage devices [130] are disks.
13. The multi protocol storage appliance [100] as claimed in claim 1, wherein the file system [320] provides (i) file system semantics, such as naming of storage objects, and (ii) functions associated with a volume manager.

Patent Number 242256
Indian Patent Application Number 471/DELNP/2005
PG Journal Number 35/2010
Publication Date 27-Aug-2010
Grant Date 20-Aug-2010
Date of Filing 08-Feb-2005
Name of Patentee NETWORK APPLIANCE, INC.,
Applicant Address 495 EAST JAVA DRIVE, SUNNYVALE, CALIFORNIA 94089, UNITED STATES OF AMERICA
Inventors:
# Inventor's Name Inventor's Address
1 BRIAN PAWLOWSKI 1156 HAMILTON, AVENUE, PALO ALTO, CA 94301, USA
2 MOHAN SRINIVASAN 117 CRONIN DRIVE, SANTA CLARA, CA 95051,USA
3 HERMAN LEE 600 RAINBOW DRIVE, #160 MOUNTAIN VIEW,CA 94041,USA
4 VIJAYAN RAJAN 431 COSTA MESA TERRACE #A SUNNYVALE,CA 94085-1612, USA
5 JOSEPH C. PITTMAN 1002 OAKWELL COURT, APEX NC 27502, USA
PCT International Classification Number G06F 12/08
PCT International Application Number PCT/US2003/023597
PCT International Filing date 2003-07-28
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 10/215917 2002-08-09 U.S.A.