
Multiple names for one computer - Consolidate your SMB file servers without breaking UNC paths


Overview

This blog post covers a few different ways to consolidate multiple SMB file servers and keep exposing the consolidated file shares under the old share paths.

Scenario

Let’s say you currently have 3 file servers named file1.contoso.local, file2.contoso.local and file3.contoso.local and you want to consolidate them into a single Windows Server computer called cfile.contoso.local. It’s simple enough to use a tool like ROBOCOPY or the File Server Migration Tool (FSMT) to copy your shares and files over to the new box.

You can get additional information about FSMT at the Microsoft File Server Migration Toolkit web page. You can also check my blog post: Microsoft File Server Migration Toolkit 1.2 available as a free download. 

The problem here is that users and applications will be unable to use the old server names, like \\file1\Orders\Order1.doc. After the consolidation, that path needs to be changed to use the new server name, like \\cfile\Orders\Order1.doc. However, there are ways to avoid this issue and show the shares under the old names, allowing the old UNC paths to continue to work unchanged.

Option 1: Static entries in DNS

If you use static DNS entries for your file servers, it's easy enough to simply update the DNS entries for the old file servers (after the old servers are retired) to point to the IP address of the new server. This means you would have multiple DNS "A" records (A stands for address), all pointing to the new server's IP address.

This is a simple enough solution if you use static DNS entries. You can use the DNS Management MMC to edit the zone, or you can use the DNSCMD command line tool. For instance, assuming your DNS server is at dc1.contoso.local and the IP address of the cfile.contoso.local computer is 192.168.1.11, you could use:

DNSCMD dc1.contoso.local /RecordAdd contoso.local File1 A 192.168.1.11
DNSCMD dc1.contoso.local /RecordAdd contoso.local File2 A 192.168.1.11
DNSCMD dc1.contoso.local /RecordAdd contoso.local File3 A 192.168.1.11

You can find more information about this on the TechNet page about how to add a host (A or AAAA) resource record to a zone.
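If your DNS servers run a newer version of Windows Server that includes the DnsServer PowerShell module, the same records can be added and verified with PowerShell. Here is a minimal sketch under that assumption, reusing the names and addresses from the example above:

# Add A records for the retired server names, all pointing at the new server's address
# (assumes the DnsServer module, available on newer Windows Server versions)
Add-DnsServerResourceRecordA -ComputerName dc1.contoso.local -ZoneName "contoso.local" -Name "file1" -IPv4Address 192.168.1.11
Add-DnsServerResourceRecordA -ComputerName dc1.contoso.local -ZoneName "contoso.local" -Name "file2" -IPv4Address 192.168.1.11
Add-DnsServerResourceRecordA -ComputerName dc1.contoso.local -ZoneName "contoso.local" -Name "file3" -IPv4Address 192.168.1.11

# Verify that an old name now resolves to the new server's address
Resolve-DnsName -Name file1.contoso.local -Type A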

 

Please note that this option will not work with Kerberos authentication, so this is not a recommended solution.

 

Option 2: Alternate Computer Names and Dynamic DNS

Another way to do this, if you are running Windows Server, is to add alternate computer names to your new server (after your old servers are retired). This can be done easily by using the NETDOM COMPUTERNAME command.

This command, which works only on Windows Server, allows you to add more names to a computer in addition to its primary name. You can see details on how to use the command at the TechNet page about Netdom Computername.

For instance, if the domain in the scenario example is contoso.local and the FQDN for the file server is CFILE.contoso.local, you could add the other names with:

NETDOM COMPUTERNAME cfile /ADD file1.contoso.local
NETDOM COMPUTERNAME cfile /ADD file2.contoso.local
NETDOM COMPUTERNAME cfile /ADD file3.contoso.local
IPCONFIG /registerdns

The last command makes sure the alternate names are properly registered with your DNS server, where other computers in the domain will find them. To check all the names of the computer, primary and alternate, you can use the command:

NETDOM COMPUTERNAME cfile /ENUM

The information about the alternate computer names is kept in the registry, as shown on the TechNet page on DNS Registry Entries.
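On systems where the DnsClient PowerShell module is available (newer versions of Windows), a quick sanity check after adding the alternate names might look like this sketch, using the Orders share from the scenario above:

# Confirm the alternate names resolve to the consolidated server's address
Resolve-DnsName -Name file1.contoso.local -Type A
Resolve-DnsName -Name file2.contoso.local -Type A

# Confirm an old UNC path still works through an alternate name
Test-Path \\file1.contoso.local\Orders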

  

Option 3: DFS Consolidation Roots

While these first 2 options can preserve the old UNC paths, they do change the original behavior of the 3 file servers. Each of the 3 used to expose a specific set of file shares, but the consolidated file server will expose a single set, merging all the original shares, under each of the three alternate names.

This might actually be a side effect you welcome or at least can tolerate, but maybe you are required to preserve the exact original behavior, with each of the original names showing a precise subset of the shares. This might be especially troublesome if there were different shares by the same name on the three original servers. Merging them might bring additional trouble, even conflicting file paths.

You can overcome this by using a DFS feature called consolidation roots. This feature in DFS-Namespaces allows you to map each of the old server names to a specific namespace. This way you can configure exactly which shares should show under each old name. In our case, in addition to having DNS point file1, file2 and file3 to the address of cfile, you would create three namespaces called \\cfile\#file1, \\cfile\#file2 and \\cfile\#file3 and populate them with the right links to the location of your shares.

The FSMT includes a Consolidation Root Wizard that facilitates creating these special types of standalone namespaces. You can get the toolkit from the download page for the Microsoft File Server Migration Toolkit 1.2. You can also learn how to create the consolidation roots manually at the support page about the Distributed File System update to support consolidation roots in Windows Server 2003.

Option 4: Virtual Machines 

Another method that’s becoming increasingly popular is using virtualization to consolidate your file servers. Instead of using a new single file server, you create a new single server running Hyper-V with 3 virtual machines each running a file server. In that case, you can preserve the original names (and even the same IP addresses) without having to keep the old hardware around. While that might lead to more machines to manage, that will make your migration much simpler.

System Center Virtual Machine Manager includes the tools to migrate your physical machines to virtual machines (commonly referred to as P2V). For details, see the TechNet page on P2V: Converting Physical Computers to Virtual Machines in VMM. You can also add Failover Clustering to make your virtual machines highly available.

Option 5: Failover Clusters

Speaking of high availability, you have another option using Windows Server Failover Clusters. Consolidated file servers can usually use the extra availability, since you do end up with "many eggs in the same basket". The process was greatly simplified in Windows Server 2008 and later, and you can find on TechNet a Failover Cluster Step-by-Step Guide: Configuring a Two-Node File Server Failover Cluster.

Clustered file servers are created in a way that includes assigning a specific name and IP address to each file service or cluster group. You can use that ability to create multiple names, each with a specific set of shares, potentially mimicking the original configuration of your old file servers.

Starting in Windows Server 2008, file shares are scoped to a specific name, so you naturally get the different sets of shares associated with their specific name, even if they are hosted on the same cluster node. This is explained in detail in this blog post about File Share 'Scoping' in Windows Server 2008 Failover Clusters

  

Option 6: Scoped Shares

Speaking of scoped shares, they are another way to address the issue of having different shares depending on the name used to get to the file server. These would work on a standalone file server (not clustered). However, there are no command-line tools or GUI tools to create scoped shares on a standalone file server.

If you are a developer, you might be willing to try to create these scoped shares yourself by calling the NetShareAdd function. In that case, you need to use information level 503. See the remarks on the link for additional information, including the need to use the NetServerTransportAddEx function.
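If you want to experiment from PowerShell instead of C, a rough sketch of the NetShareAdd call at information level 503 is shown below. The share name, path and scope name are just examples, the struct layout follows the SHARE_INFO_503 documentation, and the sketch does not include the NetServerTransportAddEx step mentioned in the remarks, which is still required before the scoped name will answer:

Add-Type -TypeDefinition @"
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct SHARE_INFO_503
{
    public string shi503_netname;
    public uint   shi503_type;
    public string shi503_remark;
    public uint   shi503_permissions;
    public uint   shi503_max_uses;
    public uint   shi503_current_uses;
    public string shi503_path;
    public string shi503_passwd;
    public string shi503_servername;   // the scoped name clients must use
    public uint   shi503_reserved;
    public IntPtr shi503_security_descriptor;
}

public static class NativeShare
{
    [DllImport("Netapi32.dll", CharSet = CharSet.Unicode)]
    public static extern uint NetShareAdd(string servername, uint level, ref SHARE_INFO_503 buf, out uint parmErr);
}
"@

$info = New-Object SHARE_INFO_503
$info.shi503_netname    = "Orders"            # example share name
$info.shi503_type       = 0                   # STYPE_DISKTREE
$info.shi503_remark     = "Scoped to FILE1"
$info.shi503_max_uses   = [uint32]::MaxValue  # unlimited connections
$info.shi503_path       = "D:\Shares\Orders"  # example local path
$info.shi503_servername = "FILE1"             # scope (server) name for the share

$parmErr = [uint32]0
$result = [NativeShare]::NetShareAdd($null, 503, [ref]$info, [ref]$parmErr)
"NetShareAdd returned $result (0 means success)"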

Conclusion

I hope this post has helped you understand your options for consolidating your file servers and keep your old UNC paths in the process. As usual, you should test any procedures in a test environment before deploying it in production. Now let's get started with the planning for those consolidation projects.


Using the multiple NICs of your File Server running Windows Server 2008 (and 2008 R2)


 

IMPORTANT NOTE: This blog post was created before the release of Windows Server 2012, which introduced SMB 3.0 and the new SMB Multichannel feature and significantly improved SMB's ability to use multiple network interfaces. You can read more about SMB Multichannel at http://blogs.technet.com/b/josebda/archive/2012/06/28/the-basics-of-smb-multichannel-a-feature-of-windows-server-2012-and-smb-3-0.aspx

------------------------------------------------------

 

1 - Overview

 

When you set up a File Server, there are advantages to configuring multiple Network Interface Cards (NICs). However, there are many options to consider depending on how your network and services are laid out. Since networking (along with storage) is one of the most common bottlenecks in a file server deployment, this is a topic worth investigating.

Throughout this blog post, we will look into different configurations for Windows Server 2008 (and 2008 R2) where a file server uses multiple NICs. Next, we'll describe how the behavior of the SMB client can help distribute the load for a file server with multiple NICs. We will also discuss SMB2 Durability and how it can recover from certain network failures in configurations where multiple network paths between clients and servers are available. Finally, we will look closely at the configuration of a Clustered File Server with multiple client-facing NICs.

 

2 – Configurations

 

We'll start by examining 8 distinct configurations where a file server has multiple NICs. These are by no means the only possible configurations, but each one has a unique characteristic that is used to introduce a concept on this subject.

  

2.1 – Standalone File Server, 2 NICs on server, one disabled

 

This first configuration shows the sad state of many File Servers out there. There are multiple network interfaces available, but only one is actively being used. The other is not connected and possibly disabled. Most server hardware these days does include at least two 1GbE interfaces, but sometimes the deployment planning did not include the additional cabling and configuration to use them. Ironically, a single 1GbE interface (which provides roughly 100 MBytes per second of throughput) is a common bottleneck for your file server, especially when reading data from cache or from many disk spindles (physical disk throughput is the other most common bottleneck).

[Image: standalone file server with two NICs, only one connected]

Having a single NIC has an additional performance downside if that NIC does not support Receive-side Scaling (RSS). When RSS is not available, a single CPU services all the interrupts from a network adapter. For instance, if you have an 8-core file server using a single non-RSS NIC, that NIC will affinitize to one of the 8 cores, making it even more likely to become a bottleneck. To learn more about deploying RSS, check http://www.microsoft.com/whdc/device/network/NDIS_RSS.mspx.

 

2.2 – Standalone File Server, 2 NICs on server, teamed

 

One simple and effective solution for enabling the multiple NICs on a File Server is NIC Teaming, also known as "Link Aggregation" or "Load Balancing and Failover (LBFO)". These solutions, provided by vendors like Intel, Broadcom and HP, effectively combine multiple physical NICs into one logical or virtual NIC. The details vary based on the specific solution, but most will provide an increase in throughput and also tolerance to the failure of a NIC or to a network cable being accidentally unplugged.

The NIC team typically behaves as a single NIC, requiring only a single IP address. Once you configure the team itself, the Windows Server and File Server configuration proceeds as if you had only one NIC. However, NIC teaming is not something included with Windows Server 2008 or Windows Server 2008 R2. Support for these solutions (the hardware, the drivers and the configuration tools) is provided by the hardware manufacturer.

You can find Microsoft’s support policy for these types of solutions at http://support.microsoft.com/kb/254101 and http://support.microsoft.com/kb/968703.

[Image: standalone file server with two NICs combined into a team]

 

2.3 – Standalone File Server, 2 NICs on server, single subnet

 

If you don't have a NIC teaming solution available, there are still ways to put both NICs on your File Server to work. You can simply enable and configure both NICs, each with its own IP address. If everything is configured properly, both IP addresses will be published to DNS under the name of the File Server. The SMB client will then be able to query DNS for the file server name, find that it has multiple IP addresses and choose one of them. Due to DNS round robin, chances are the clients will be spread across the NICs on the file server.

There are several Windows Server components contributing to make this work. First, there's the fact that the File Server will listen on all configured network interfaces. Second, there's dynamic DNS, which automatically registers all the IP addresses under the server's name (if configured properly). Third, there's the fact that DNS will naturally round robin through the different addresses registered under the same name. Last but not least, there is the File Client, which will use one of the available IP addresses, giving priority to the first address on the list (to honor the DNS round robin) but using one of the others if the first one does not respond quickly. The SMB client will use only one of the IP addresses at a time. More on this later.

What’s more, due to a feature called SMB2 durability, it’s possible that the SMB client will recover from the failure of a NIC or network path even if it’s right in the middle of reading or writing a file. More on this later as well.

[Image: standalone file server with two NICs on a single subnet]

It's important to note that applications other than the file server and file client might not behave properly with this configuration (they might not listen on all interfaces, for instance). You might also run into issues updating routing tables, especially in the case of a failure of a NIC or removal of a cable. These issues are documented in KB article 175767. For these reasons, many will not recommend this specific setup with a single subnet.

2.4 – Standalone File Server, 2 NICs on server, multiple subnets

 

Another possible configuration is for each of the File Server NICs to connect to a different set of clients. This is useful to give you additional overall throughput, since you get traffic coming into both NICs. However, in this case, you are using different subnets. A typical case would be a small company where you have the first floor clients using one NIC and the second floor using the other.

While both of the IP addresses get published to DNS (assuming everything is configured correctly) and each of the SMB clients will learn of both, only one of them will be routable from a specific client. From the SMB client’s perspective, that is fine. If one of them works, you will get connected. However, keep in mind that this configuration won’t give your clients a dual path to the File Server. Each set of clients has only one way to get to the server. If a File Server NIC goes bad or if someone unplugs one of the cables from the File Server, some of your clients will lose access while others will continue to work fine.

[Image: standalone file server with two NICs, each on a different subnet]

 

2.5 – Standalone File Server, 2 NICs on server, multiple subnets, router

 

In larger networks, you will likely end up with various networks/subnets for both clients and servers, connected via a router. At this point you probably did a whole lot of planning, your server subnets can easily be distinguished from your client subnets and there's a fair amount of redundancy, especially on the server side. A typical configuration would include dual top-of-rack switches on the server side, aggregated to a central switching/routing infrastructure.

If everything is configured properly, the File Servers will have two IP addresses each, both published to the dynamic DNS. From a client perspective, you have a File Server name with multiple IP addresses. The clients here see something similar to what clients see in configuration 2.3, except for the fact that the IP addresses for the client and servers are on different subnets.

It is worth noting that, in this configuration and all the following ones, you could choose to leverage NIC teaming, as described in configuration 2.2, if that is an option available from your hardware vendor. This configuration might bring additional requirements, since each of the NICs in the team is going into a different switch. The configuration of Windows Server itself would be simplified due to the single IP address, although additional teaming configuration with the vendor tool would be required.

[Image: standalone file server with two NICs on different subnets, connected via a router]

 

2.6 – Standalone File Server, 2 NICs on “clients” and servers, multiple subnets

 

This last standalone File Server configuration shows both clients and servers with 2 NICs each, using 2 distinct subnets. While this configuration is unusual for regular Windows clients, servers are commonly configured this way for added network fault tolerance. Here are a few examples of such server workloads:

  • IIS servers that store their HTM and JPG files on a file share
  • SQL Servers that regularly send their database backups to a UNC path
  • Virtualization Servers that use the file server as a library server for ISO files
  • Remote Desktop Servers (Terminal Servers) where many users use the file server to store their home folders
  • SharePoint servers configured to crawl file shares and index them

You can imagine the computers below as part of the configuration 2.5 above, only with more servers to the right of the router this time.

[Image: standalone file server and server clients, each with two NICs on two subnets]

 

2.7 – Clustered File Servers, 2 NICs on servers, multiple subnets, router

 

If you are introducing dual network interfaces for fault tolerance, you are also likely interested in clustering. This configuration takes config 2.6 and adds an extra file server to create a failover cluster for your file services. If you are familiar with failover clustering you know that, in addition to the IP addresses required for the cluster nodes themselves, you would need IP addresses for each clustered file service (like File Service A and File Service B). More on this later.

Although we’re talking about Clustered File Services with Cluster IP addresses, the SMB clients will essentially see a File Server name with multiple IP addresses for each clustered file service. In fact, the clients here see something similar to configurations 2.3, 2.5 and 2.6. It’s worth noting that, if File Server 1 fails, Failover Clustering will move File Service A to File Server 2, keeping the same IP addresses.

[Image: clustered file servers with two NICs each, on multiple subnets behind a router]

2.8 – Clustered File Server, 2 NICs on “clients” and servers, multiple subnets

 

This last configuration focuses on file clients that are servers themselves, as described in configuration 2.6. This time, however, the File Servers are clustered. If you are interested in high availability for the file servers, it's likely you would also be clustering those other servers, if their workloads allow for it.

[Image: clustered file servers and server clients, each with two NICs on multiple subnets]

3 – Standalone File Server

 

3.1 – SMB Server and DNS

 

A Windows Server file server with multiple NICs enabled and configured for dynamic DNS will publish multiple IP addresses to DNS for its name. The example below shows a server with 3 NICs, each on a different subnet, and how this got published to DNS: FS1 shows up with 3 DNS A records, for 192.168.1.2, 192.168.2.2 and 192.168.3.2. It's also important to know that the SMB server will listen on all interfaces by default.

[Image: server FS1 with three NICs and the corresponding three A records in DNS]

 

3.2 – SMB Client and DNS

 

From an SMB client perspective, the computers will query DNS for the name of the File Server. In the case of the example above, you will get an ordered list of IP addresses. You can query this information using the NSLOOKUP tool, as shown in the screenshot below.

[Image: NSLOOKUP output showing the file server name resolving to multiple IP addresses]

The SMB client will attempt to connect to all routable IP addresses offered. If more than one routable IP address is available, the SMB client will connect to the first IP address for which a response is received. The first IP address in the list is given a time advantage, to favor the DNS round robin sequence.

To show this in action, I created a configuration where the file server has 3 IP addresses published to DNS. I then ran a script to copy a file to a share in the file server, then flush the DNS cache, wait a few seconds and start over. Below are the sample script and screenshot, showing how the SMB client cycles through the 3 different network interfaces as an effect of DNS round robin.

@ECHO OFF
:LOOP
REM Clear the DNS cache so the next name lookup goes back to the DNS server
IPCONFIG /FLUSHDNS
PING FS1 -N 1 | FIND "Reply"
REM Map a drive to the share, copy a file, then remove the mapping
NET USE F: \\FS1\TEST >NUL
COPY C:\WINDOWS\SYSTEM32\MRT.EXE F:\
NET USE F: /DELETE >NUL
CHOICE /T 30 /C NY /D Y /M "Waiting a few seconds... Continue"
IF ERRORLEVEL 2 GOTO LOOP
:END

[Image: script output showing the SMB client cycling through the three network interfaces]

 

3.3 – SMB2 Durability

 

When the SMB2 client detects a network disconnect or failure, it will try to reconnect. If multiple network paths are available and the original path is now down, the reconnection will use a different one. Durable handles allow the application using SMB2 to continue to operate without seeing any errors. The SMB2 server will keep the handles for a while, so that the client will be able to reconnect to them.

Durable handles are opportunistic in nature and offer no guarantee of reconnections. For durability to occur, the following conditions must be met:

  • Clients and servers must support SMB2
  • Durable handles must be used (this is the default for SMB2)
  • Opportunistic locks (oplocks) must be granted when files are opened (this is the default for SMB2)

For Windows operating systems, SMB2 is found on Windows Vista, Windows 7 (for client OSes), Windows Server 2008 and Windows Server 2008 R2 (for server OSes). Older versions of Windows will have only SMB1, which does not have the concept of durability.

To showcase SMB2 durability, I used the same configuration shown in the previous screenshot and copied a large number of files to a share. While the copy was going, I disabled Network1, the client network interface that was being used by the SMB client. SMB2 durability kicked in and the SMB client moved to use Network3. I then disabled Network3 and the client started using Network2. You can see in the screenshot below that, with only one of the three interfaces available, the copy is still going.

[Image: file copy continuing with only one of the three network interfaces still enabled]

 

4 – Clustered File Server

 

4.1 – Cluster Networks, Cluster Names and Cluster IP Addresses

 

In a cluster, in addition to the regular node name and IP addresses, you get additional names for the cluster itself and every service (Cluster Group) you create. Each name can have one or more IP addresses. You can add an IP address per public (client-facing) Cluster Network for every Name resource. This includes the Cluster name itself, as shown below:

[Image: cluster name resource with one IP address per public cluster network]

For each File Server resource, you have a name resource, at least one IP address resource, at least one disk resource and the file server resource itself. See the example below for the File Service FSA, which uses 3 different IP addresses (192.168.1.101, 192.168.2.101 and 192.168.3.101).

[Image: resources for File Service FSA, including its three IP address resources]

Below is a screenshot of the name resource properties, showing the tab where you can add the IP addresses:

[Image: name resource properties, showing the tab where the IP addresses are added]
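If you prefer the command line, the Failover Clustering PowerShell module in Windows Server 2008 R2 can list the same resources. A quick sketch using the FSA group from the example above:

Import-Module FailoverClusters

# List the name, IP address, disk and file server resources in the FSA group
Get-ClusterGroup -Name "FSA" | Get-ClusterResource | Format-Table Name, ResourceType, State -AutoSize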

 

4.2 – How Cluster IP addresses are published to DNS

 

Note that, in the cluster shown in the screenshots, we have 5 distinct names, each of them using 3 IP addresses, since we are using 3 distinct public Cluster Networks. You have the names of the two nodes (FS1 and FS2), the name of the cluster (FS) and the names of the two clustered file services (FSA and FSB). Here's how this shows up in DNS, after everything is properly configured:

[Image: DNS zone showing the five cluster names, each with three A records]

 

4.3 – Cluster Name and Cluster IP Address dependencies

 

When your clients are using a File Server resource with multiple routable IP addresses, you should make sure the IP addresses are defined as OR dependencies, not AND dependencies, in your dependency definitions. This means that, even if you lose all but one IP address, the file service is still operational on that node. The default is AND, and this will cause the file service to fail over upon the failure of any of the IP addresses, which is typically not desired.

[Image: dependency settings for the name resource, using OR between the IP addresses]
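If you prefer to script this instead of using the Failover Cluster Manager GUI, the dependency expression can also be set with PowerShell. Here is a sketch with hypothetical resource names:

Import-Module FailoverClusters

# Show the current dependency expression for the network name resource
Get-ClusterResourceDependency -Resource "FSA Network Name"

# Make the name depend on ANY of its IP addresses (OR instead of AND)
Set-ClusterResourceDependency -Resource "FSA Network Name" -Dependency "[FSA IP Address 1] or [FSA IP Address 2] or [FSA IP Address 3]"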

Below you can see the File Service resource with only one of the three IP addresses failed. There is an alert, but the file service is still online and will not fail over unless all IP addresses fail.

[Image: File Service resource online with one of its three IP addresses failed]

 

5. Conclusion

 

Network planning and configuration plays a major role in your File Server deployment. I hope this blog post has allowed you to consider increasing the throughput and the availability of your File Server by enabling and configuring multiple NICs. I encourage you to experiment with these configurations and features in a lab environment.

New five-part blog series on DFS Replication


Ned Pyle, a Senior Escalation Support Engineer with the Directory Services team at Microsoft, has just started a new blog series focusing on DFS Replication.

 

Here’s how he introduces it:

 

Hello folks, Ned here again to kick off a new five-part series on DFSR. With the release of Windows Server 2008 R2, the warming of economies, and the timing of hardware leases, we have started seeing more questions around replacing servers within existing DFSR Replication Groups. Through the series I will discuss the various options and techniques around taking an existing DFSR replica and replacing some or all of its servers. Depending on your configuration and budget, this can range from a very seamless operation that users will never notice to a planned outage where even their local server may not be available for a period of time. I leave it to you and your accountants to figure out which matters most. This series also gives updated steps on validated pre-seeding to avoid any conflicts and maximize your initial sync performance. I will also speak about new options you have in this replacement cycle for clusters and read-only replication.

 

Here’s the series index:

  • Replacing DFSR Member Hardware or OS (Part 1: Planning)
  • Replacing DFSR Member Hardware or OS (Part 2: Pre-seeding)
  • Replacing DFSR Member Hardware or OS (Part 3: N+1 Method)
  • Replacing DFSR Member Hardware or OS (Part 4: Disk Swap)
  • Replacing DFSR Member Hardware or OS (Part 5: Reinstall and Upgrade)

 

You can read Part 1 right now at http://blogs.technet.com/b/askds/archive/2010/09/03/replacing-dfsr-member-hardware-or-os-part-1-planning.aspx

 

 

What version of SMB2 am I using on my Windows file server?


Note: This post is now obsolete. Please refer to this newer post which includes coverage of  SMB 3.0:
http://blogs.technet.com/b/josebda/archive/2012/06/06/windows-server-2012-which-version-of-the-smb-protocol-smb-1-0-smb-2-0-smb-2-1-or-smb-3-0-you-are-using-on-your-file-server.aspx

 


 

I recently talked to a customer that was surprised to hear that their Windows 7 clients were not using the latest version of SMB2 to talk with their Windows Server 2003 file servers.

I explained to him that, in order to use SMB2, both sides of the connection have to support it. If not, they will negotiate down to the highest version that both support.

I also explained that Windows actually uses 2 different versions of SMB2:

  • SMB2 (technically SMB2 version 2.002) which is the version on Windows Vista SP1 (or later SP) and Windows Server 2008 (or any SP)
  • SMB2.1 (technically SMB2 version 2.1) which is the version on Windows 7 (or any SP) and Windows Server 2008 R2  (or any SP)

However, all versions offer the ability to negotiate the SMB client and server capabilities and they will talk to older versions at their level. This “negotiate” process happens automatically and it is transparent to end users and applications.

Here’s a table to help you understand what version you end up using, depending on what Windows client version is talking to what Windows Server version: 

Client OS \ Server OS                     | Previous versions of Windows | Windows Vista SP1+ / Windows Server 2008 | Windows 7 / Windows Server 2008 R2
------------------------------------------|------------------------------|------------------------------------------|-----------------------------------
Previous versions of Windows              | SMB 1                        | SMB 1                                    | SMB 1
Windows Vista SP1+ / Windows Server 2008  | SMB 1                        | SMB2 (v2.002)                            | SMB2 (v2.002)
Windows 7 / Windows Server 2008 R2        | SMB 1                        | SMB2 (v2.002)                            | SMB2.1

 If you don’t know what changed from SMB1 to SMB2, I recommend that you read this blog post:
http://blogs.technet.com/b/josebda/archive/2008/12/09/smb2-a-complete-redesign-of-the-main-remote-file-protocol-for-windows.aspx

For details on what changed from SMB2 to SMB2.1, you can check this deck from SNIA’s Storage Developer’s Conference, delivered by David Kruse, Microsoft’s Developer Lead on the SMB2 team:
http://www.snia.org/events/storage-developer2009/presentations/tuesday/DavidKruse_SMBv21.pdf

 


Note 1: If you consider yourself an SMB2 geek and you actually want to understand the SMB NEGOTIATE command in greater detail, you can read the MS-SMB2 protocol documentation.

Note 2: During the recent SNIA CIFS/SMB/SMB2 PlugFest, the T-shirt shown below was handed to every attendee. It’s a play on a diagram from the MS-SMB2 protocol documentation, with a few “customizations” from the original version.

[Image: SMB2 PlugFest T-shirt based on a diagram from the MS-SMB2 protocol documentation]

The Basics of SMB Signing (covering both SMB1 and SMB2)


SMB Signing Overview

Server Message Block (SMB) is the file protocol most commonly used by Windows. SMB Signing is a feature through which communications using SMB can be digitally signed at the packet level. Digitally signing the packets enables the recipient of the packets to confirm their point of origination and their authenticity. This security mechanism in the SMB protocol helps avoid issues like tampering of packets and “man in the middle” attacks.

SMB signing is available in all currently supported versions of Windows, but it’s only enabled by default on Domain Controllers. This is recommended for Domain Controllers because SMB is the protocol used by clients to download Group Policy information. SMB signing provides a way to ensure that the client is receiving genuine Group Policy.

SMB signing was introduced in Windows 2000 (at the time it was also ported back to Microsoft Windows NT 4.0 and Microsoft Windows 98). With the introduction of SMB2 in Windows Vista and Windows Server 2008, signing was improved by using a new hashing algorithm (HMAC SHA-256 replaced the old MD5). At that time, the settings were updated to simplify configuration and interoperability (you can find details later in the post). Another important improvement in SMB2 signing is performance. In SMB1, enabling signing significantly decreases performance, especially when going across a WAN. In SMB2, there is almost no measurable degradation in performance, although there is still a higher CPU load.

SMB1 Signing Configuration and Defaults

There are two main ways to configure signing for SMB1 clients and SMB1 servers. The easier one is to use a Group Policy setting. This is, for instance, how domain controllers are configured by default to require signing. The other way is to use registry settings. On each side (SMB1 client and SMB1 server), SMB1 signing can be set to "Required", "Enabled" or "Disabled".

Here’s a summary of the SMB1 Client signing settings:

Setting   | Group Policy Setting                                        | Registry Keys
----------|-------------------------------------------------------------|-----------------------------------------------------------
Required  | Digitally sign communications (always) – Enabled            | RequireSecuritySignature = 1
Enabled*  | Digitally sign communications (if server agrees) – Enabled  | EnableSecuritySignature = 1, RequireSecuritySignature = 0
Disabled  | Digitally sign communications (if server agrees) – Disabled | EnableSecuritySignature = 0, RequireSecuritySignature = 0

Here’s a summary of SMB1 Server signing settings:

Setting     | Group Policy Setting                                        | Registry Keys
------------|-------------------------------------------------------------|-----------------------------------------------------------
Required*** | Digitally sign communications (always) – Enabled            | RequireSecuritySignature = 1
Enabled     | Digitally sign communications (if client agrees) – Enabled  | EnableSecuritySignature = 1, RequireSecuritySignature = 0
Disabled**  | Digitally sign communications (if client agrees) – Disabled | EnableSecuritySignature = 0, RequireSecuritySignature = 0

* The default setting for signing on SMB1 Clients is “Enabled”.
** The default setting for signing on SMB1 Servers is “Disabled”.
*** The default setting for signing on Domain Controllers (defined via Group Policy) is “Required”.

The Group Policy settings are found under Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options.
Client registry keys are stored under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanWorkStation\Parameters.
Server registry keys are stored under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanServer\Parameters.
All registry keys are of type DWORD.

SMB2 Signing Configuration and Defaults

SMB2 simplified this configuration by having only one setting: whether signing was required or not. This can be configured via Group Policy or registry setting, on SMB2 clients and SMB2 servers. On each side, signing can be set to be “Required” or “Not Required”. 

Here’s a summary of the SMB2 client and SMB2 server signing settings:

Setting         | Group Policy Setting                               | Registry Key
----------------|----------------------------------------------------|------------------------------
Required *      | Digitally sign communications (always) – Enabled   | RequireSecuritySignature = 1
Not Required ** | Digitally sign communications (always) – Disabled  | RequireSecuritySignature = 0

* The default setting for signing on a Domain Controller (defined via Group Policy) is “Required”.
** The default setting for signing on SMB2 Servers and SMB Clients is “Not Required”.

The Group Policy setting is found under Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options.
Client registry key is stored under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanWorkStation\Parameters.
Server registry key is stored under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanServer\Parameters.
All registry keys are of type DWORD.

SMB Signing Effective Behavior

There is a negotiation done between the SMB client and the SMB server to decide whether signing will effectively be used.

Here’s a summary of the effective behavior for SMB2:

                      | Server – Required | Server – Not Required
----------------------|-------------------|----------------------
Client – Required     | Signed*           | Signed
Client – Not Required | Signed            | Not Signed**

Here’s a summary of the effective behavior for SMB1 in current versions of Windows:

                  | Server – Required | Server – Enabled | Server – Disabled
------------------|-------------------|------------------|------------------
Client – Required | Signed            | Signed           | Signed
Client – Enabled  | Signed*           | Signed           | Not Signed**
Client – Disabled | Signed            | Not Signed       | Not Signed

* Default for Domain Controller SMB traffic.
** Default for all other SMB traffic.

Older SMB1 Signing Behavior

A common source of confusion around SMB1 signing is the fact that older versions of Windows had a different signing behavior. That behavior was changed in 2008 to match the behavior of Windows Server 2008 and Windows Vista as documented at http://support.microsoft.com/kb/950876. Here’s a summary of the effective behavior for early versions of Windows Server 2003 and Windows XP (or older):

                      | Old Server – Required | Old Server – Enabled | Old Server – Disabled
----------------------|-----------------------|----------------------|----------------------
Old Client – Required | Signed                | Signed               | Fails to connect
Old Client – Enabled  | Signed*               | Signed               | Not Signed**
Old Client – Disabled | Fails to connect      | Not Signed           | Not Signed

* Default for Domain Controller SMB1 traffic.
** Default for all other SMB1 traffic.

If you have an old SMB1 server or old SMB1 client, you should have it patched or updated to remove the possibility of failures to connect in a misconfigured environment.

Changing the SMB signing behavior

In general, it is recommended that you keep the default SMB signing settings. However, customers sometimes want to reconfigure SMB signing in specific situations. For instance, the customer could have the need to:

  • Increase SMB performance in Domain Controllers. It's true that SMB signing requires additional processing for hash calculation, so you could increase a domain controller's SMB performance by disabling the "Required" setting on Domain Controllers. However, we strongly discourage changing the default, since it will also expose your Group Policy to tampering and man-in-the-middle attacks.
  • Allow the use of WAN "optimization" devices to speed up SMB traffic between branch offices and the head office by disabling the "Required" setting on Domain Controllers. Again, you're trading performance for security. Although these devices could be legitimate, they essentially behave as a broker and would be in a position to relay obsolete group policy settings or even tampered ones (if compromised).
  • Increase the security for SMB clients or SMB servers that are not Domain Controllers. By enabling the "Required" setting on SMB clients or SMB servers, you could force all SMB traffic to be signed. Signing all SMB traffic is not recommended because it requires additional processing (for hash calculation) and will decrease SMB performance.

If you decide that you must change the SMB signing settings, the recommendation is to use the “Digitally sign communications (always)” Group Policy setting. If you cannot do it via Group Policy, you could use the “RequireSecuritySignature” registry setting.

IMPORTANT: We no longer recommend using “Digitally sign communications (if client agrees)” or “Digitally sign communications (if server agrees)” Group Policy settings. We also no longer recommend using the “EnableSecuritySignature” registry settings. These options, which only affect the SMB1 behavior, can be effectively replaced by the “Digitally sign communications (always)” Group Policy setting or the “RequireSecuritySignature” registry setting.
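If you cannot use Group Policy and need to set the registry value directly, a PowerShell sketch along these lines could be used. The paths are the ones listed in the tables above; this is just an illustration, and the corresponding service may need a restart before the change takes effect:

# Require SMB signing on the server side
Set-ItemProperty -Path "HKLM:\System\CurrentControlSet\Services\LanmanServer\Parameters" -Name RequireSecuritySignature -Value 1 -Type DWord

# The equivalent client-side value lives under LanmanWorkstation\Parameters
Set-ItemProperty -Path "HKLM:\System\CurrentControlSet\Services\LanmanWorkstation\Parameters" -Name RequireSecuritySignature -Value 1 -Type DWord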

References

Here are a few Knowledge Base articles (support) and TechNet articles that provide additional details on SMB signing. Please be careful interpreting these references, since some of them refer to the older SMB1 behavior.

Currently available hotfixes for SMB and SMB2 File Server components in Windows Server 2008 and Windows Server 2008 R2


If you're installing a Windows Server 2008 or Windows Server 2008 R2 file server, we always recommend getting the latest hotfixes from Windows Update. That will include all security updates and any updates considered important enough to be delivered to all Windows Server users. However, there is a class of hotfixes that is not pushed via Windows Update. Typically, you won't get those hotfixes until the next Service Pack, or unless you run into the specific issues they address and call Microsoft Support.

As you can imagine, these hotfixes undergo narrower testing, typically focused on resolving the specific issues reported by customers. The Service Pack process includes a much broader test process, focused on the complex interactions between multiple hotfixes from different components.

Although this is not the recommended procedure for someone interested in the most stable version of the components, some users prefer to apply the latest version of the SMB and SMB2 file server bits when installing a server, especially if they have the opportunity to perform extensive tests before going live with the server. However, there used to be no easy way to tell which are the latest non-critical hotfixes, unless you were tracking each individual hotfix that got published at http://support.microsoft.com.

Well, the good news is that the bright folks in Microsoft Support who work specifically with File Services for Windows Server have put together a Knowledge Base article (or KB) that gives you the latest hotfixes for the SMB and SMB2 file server components. Not only that, they will be keeping that KB updated as new service packs and hotfixes get published. You see, they are always on top of all the hotfixes being released for their components.

If you’d rather wait for the Service Pack for the less critical updates, that’s fine. However, if you want to install the latest version of the components related to the SMB and SMB2 file server (including SRV, MRXSMB and RDBSS), you now have a single place to look them up. And, as usual, perform all the proper testing before deploying to production. Find that new KB at http://support.microsoft.com/?id=2473205.

FSCT test results detail the performance of Windows Server 2008 R2 File Server configurations - 23,000 users with 192 spindles


1. Introduction

The File Server Capacity Tool (FSCT) is a free download from Microsoft that helps you determine the capacity of a specific file server configuration (running Windows or any operating system that implements the SMB or SMB2 protocols). It simulates a specific set of operations (the “Home Folders” workload) being executed by a large number of users against the file server, confirming the ability of that file server to perform the specified operations in a timely fashion. It makes it possible to verify, for instance, if a specific file server configuration can handle 10,000 users. In case you’re not familiar with FSCT’s “Home Folders Workload”, it simulates a standard user’s workload based on Microsoft Office, Windows Explorer, and command-line usage when the file server is the location of the user’s home directory.

We frequently use FSCT internally at Microsoft. In fact, before being released publicly, the tool was used to verify whether a specific change to the Windows code had any significant performance impact in a file server scenario. We continue to use FSCT for that purpose today.

Recently, the File Server Team released a document (available at http://www.microsoft.com/downloads/en/details.aspx?FamilyID=89a73dd0-ed31-4cc2-aa7d-2fded8a023ab) with results from a series of FSCT tests. These tests were performed in order to quantify the file server performance difference between Windows Storage Server 2008 (based on Windows Server 2008) and Windows Server 2008 R2. It was also an exercise to analyze the capacity (in terms of FSCT “Home Folders” users) of some common File Server configurations using between 24 and 192 disks.

2. Comparing Windows Server 2008 and Windows Server 2008 R2 with 24 spindles

The document includes details about how the tests were performed, what specific hardware configurations were used and what the CPU, memory, disk and network utilization was in each case. It organizes the results by operating system, showing results for all Windows Storage Server 2008 (based on Windows Server 2008) configurations, then the results of all Windows Server 2008 R2 configurations. However, I find it even more interesting to compare two identical hardware configurations running the two different versions of Windows. You can clearly see how the software improved over time. For instance, you see below how a 24-spindle configuration went from supporting 4,500 FSCT users to supporting 7,500 FSCT users. Note how Windows Server 2008 R2 was able to squeeze more out of the server, with increased CPU, memory, disk and network utilization:

FSCT Test Results   | TESTBED-C [24 HDD – R10] | TESTBED-F [24 HDD – R10]
--------------------|--------------------------|-------------------------
Max users supported | 4,500                    | 7,500
CPU utilization     | 12%                      | 28%
Memory utilization  | 34%                      | 65%
Disk utilization    | 106 MB/sec               | 193 MB/sec
Network utilization | 114 MB/sec               | 208 MB/sec
Test date           | 05/02/2010               | 02/21/2010

Hardware Configuration | TESTBED-C [24 HDD – R10] | TESTBED-F [24 HDD – R10]
-----------------------|--------------------------|-------------------------
Platform               | White box hardware       | White box hardware
Operating system       | Windows Server 2008 *    | Windows Server 2008 R2
Processor              | (1) Intel X5560 (2.8GHz) | (1) Intel X5560 (2.8GHz)
Memory                 | 16 GB                    | 16 GB
Disk drives            | (24) 72GB SFF SAS 15K    | (24) 72GB SFF SAS 15K
LUNs                   | (2) x 12 HDD (RAID-10)   | (2) x 12 HDD (RAID-10)
Disk array             | (1) FC array             | (1) FC array
Disk controller        | (1) Dual port 8Gb FC HBA | (1) Dual port 8Gb FC HBA
Network adapters       | (1) 10GbE                | (1) 10GbE

* This is actually Windows Storage Server 2008, which is built on Windows Server 2008.

This table provides an interesting snapshot of many items that matter to capacity planning. For instance, you can see how we're not really hitting a bottleneck on CPU, storage or network. My conclusion here is that we're bound by the random access performance of the individual drives (random IOPs) and we would need to add more spindles to achieve more users per server. If your goal is to provide a "Home Folders" file service to around 5,000 users and you want to save money, you could go the other way and decide to tweak TESTBED-F and use a system with less RAM (since we're not hitting that) or even configure the system with dual 1GbE network interfaces instead of 10GbE (since dual 1GbE can provide you with around 220 MB/sec). However, if you do want to change the configuration, you would need to run the tests again, since there could be other interactions when you change the hardware like that.

3. Comparing Windows Server 2008 and Windows Server 2008 R2 with 96 spindles

In a similar fashion, a 96-spindle configuration went from supporting 9,500 FSCT users to an impressive 16,500 FSCT users. Again, nothing was changed in the hardware to achieve that improvement. It was just a matter of going from Windows Storage Server 2008 (based on Windows Server 2008) to Windows Server 2008 R2 (and effectively using SMB2 version 2.1 instead of SMB2 version 2.0).

FSCT Test Results   | TESTBED-A [96 HDD – R10] | TESTBED-E [96 HDD – R10]
--------------------|--------------------------|-------------------------
Max users supported | 9,500                    | 16,500
CPU utilization     | 16%                      | 48%
Memory utilization  | 37%                      | 17%
Disk utilization    | 238 MB/sec               | 419 MB/sec
Network utilization | 260 MB/sec               | 457 MB/sec
Test date           | 05/03/2010               | 02/15/2010

Hardware Configuration | TESTBED-A [96 HDD – R10]      | TESTBED-E [96 HDD – R10]
-----------------------|-------------------------------|------------------------------
Platform               | White box hardware            | White box hardware
Operating system       | Windows Server 2008 *         | Windows Server 2008 R2
Processor              | (2) Intel X5560 (2.8GHz)      | (2) Intel X5560 (2.8GHz)
Memory                 | 32 GB                         | 72 GB
Disk drives            | (96) 72GB SFF SAS 15K         | (96) 72GB SFF SAS 15K
LUNs                   | (8) x 12 HDD (RAID-10)        | (8) x 12 HDD (RAID-10)
Disk array             | (1) FC array + (3) enclosures | (1) FC array + (3) enclosures
Disk controller        | (2) Dual port 8Gb FC HBA      | (2) Dual port 8Gb FC HBA
Network adapters       | (1) 10GbE                     | (1) 10GbE

* This is actually Windows Storage Server 2008, which is built on Windows Server 2008.

Again, you would need to look deep to understand your bottleneck here. While FSCT will provide you with a lot of performance counters, you need a human to figure out what is holding you back. Clearly it's not memory or CPU. Your network also is not at max capacity yet (in theory, you could hit at least twice what is being used by TESTBED-E using 10GbE). So, again, the bottleneck here has to be the storage. As I mentioned before, if your goal is to configure a system to provide service to around 10,000 users, you could probably play with TESTBED-E's configuration a bit (use less memory, use just one processor instead of two, reduce the number of disks) to shrink the overall acquisition cost a little while keeping the performance at a good level for that number of users. Again, you would need to rerun FSCT with that new configuration to be sure.

4. Running Windows Server 2008 R2 with 192 spindles

The document also includes a 192-spindle configuration using Windows Server 2008 R2. This is one of the most impressive FSCT results I have ever seen. In this test, a single file server was able to successfully handle 23,000 FSCT users running the “Home Folders” workload simultaneously. I wonder if you could find a similar NAS appliance configuration out there able to handle this number of FSCT users... Here are the results:

FSCT Test Results   | TESTBED-D [192 HDD – R0]
--------------------|-------------------------
Max users supported | 23,000
CPU utilization     | 63%
Memory utilization  | 23%
Disk utilization    | 601 MB/sec
Network utilization | 650 MB/sec
Test date           | 02/14/2010

Hardware Configuration | TESTBED-D [192 HDD – R0]
-----------------------|-------------------------------
Platform               | White box hardware
Operating system       | Windows Server 2008 R2
Processor              | (2) Intel X5560 (2.8GHz)
Memory                 | 72 GB
Disk drives            | (192) 72GB SFF SAS 15K
LUNs                   | (16) x 12 HDD (RAID-0)
Disk array             | (2) FC array + (6) enclosures
Disk controller        | (4) Dual port 8Gb FC HBA
Network adapters       | (2) 10GbE

In this configuration, it is much harder to find the bottleneck. We have a good amount of free memory, but we're hitting a fairly high CPU utilization for a file server workload. Both the storage and the network are fairly busy as well, at around 600 MB/sec. Also note that we're using RAID-0 here, so this configuration is not realistic for a production deployment.

5. Charts and Diagrams

Each of the configurations also includes a chart with the throughput (in FSCT scenarios per second), CPU utilization and total number of FSCT users the configuration can handle, as you can see below. These charts were created using Microsoft Excel and the text results provided by FSCT. For example, here's the chart for the 192-spindle configuration:

[Image: chart of throughput, CPU utilization and number of users for the 192-spindle configuration]

The document also provides information about the hardware used in each of the configurations, including disks, arrays, storage fabric, server, network and clients used to generate the load. There is enough information there to allow you to reproduce the tests in your own environment or lab. For instance, here’s a diagram of the 192-spindle configuration:

[Image: diagram of the 192-spindle configuration]

6. Table of Contents

This blog post provides just a sample of the information contained in the document. Here is the full table of contents:

  • Overview
    • FSCT Terminology
    • Server Tuning Information
    • Windows Storage Server 2008
    • Windows Server 2008 R2
  • TESTBED-A (WSS08, Dual Socket, 32GB RAM, (96) SAS 15K HDD, RAID-10)
    • FSCT Test Results (9500 users with 16% CPU utilization)
    • Hardware Configuration
  • TESTBED-B [WSS08, Dual Socket, 16GB RAM, (48) SAS 15K HDD, RAID-10]
    • FSCT Test Results (6500 users with 11% CPU utilization)
    • Hardware Configuration
  • TESTBED-C [WSS08, Single Socket, 16GB RAM, (24) SAS 15K HDD, RAID-10]
    • FSCT Test Results (4500 users with 12% CPU utilization)
    • Hardware Configuration
  • TESTBED-D [W2K8R2, Dual Socket, 72GB RAM, (192) SAS 15K HDD, RAID-0]
    • FSCT Test Results (23000 users with 63% CPU utilization)
    • Hardware Configuration
  • TESTBED-E [W2K8R2, Dual Socket, 72GB RAM, (96) SAS 15K HDD, RAID-10]
    • FSCT Test Results (16500 users with 48% CPU utilization)
    • Hardware Configuration
  • TESTBED-F [W2K8R2, Single Socket, 16GB RAM, (24) SAS 15K HDD, RAID-10]
    • FSCT Test Results (7500 users with 28% CPU utilization)
    • Hardware Configuration
  • Conclusion
  • References

7. Conclusion

As you can see, the document is rich in detail. If your work is related to planning, sizing or configuring file servers, it could be very useful.

I would highly recommend downloading the full document from http://www.microsoft.com/downloads/en/details.aspx?FamilyID=89a73dd0-ed31-4cc2-aa7d-2fded8a023ab

I would also encourage you to experiment with FSCT yourself. You can start at http://blogs.technet.com/b/josebda/archive/2009/09/16/file-server-capacity-tool-fsct-1-0-available-for-download.aspx

Using 4k sector and advanced format drives in Windows. HotFix and support info for Windows Server 2008 R2 and Windows 7


If you work with storage, you have probably already heard about "4K Sector Drives", "Advanced Format Drives" and "512e drives". These new 4K sector drives abandon the traditional use of 512 bytes per sector in favor of a new structure that uses 4096 bytes. The migration to the new format is eased by the use of 4K drives that emulate the old format, known as "512 Emulation Drives", "512e Drives" or "Advanced Format Drives".
 
Native 4K sector drives are currently not supported with Windows. However, 512e drives (or Advanced Format Drives) are supported with recent versions of Windows, provided that you follow the guidance in the following support article: http://support.microsoft.com/kb/2510009. There are specific requirements to be met and specific details for different Microsoft applications like Hyper-V, SQL Server and Exchange Server.
 
For Windows 7 and Windows Server 2008 R2, the KB article above mentions the requirement to install a specific hotfix described at http://support.microsoft.com/kb/982018. Please note that most of this fix is part of Windows 7 Service Pack 1 (SP1) or Windows Server 2008 R2 SP1, except for updates to the FSUTIL tool.

For you developers, head on over to MSDN to read the nitty-gritty details of this storage transition and how it may impact your applications. Details are published at http://msdn.microsoft.com/en-us/library/hh182553.aspx.

If you’re interested in these new 4K sector drives, you might also want to look at these other links:

Note: The updated version of FSUTIL is available as a download from the support KB page and, since 4/26/2011, via Windows Update labeled as "Update for Windows 7 (KB982018)".

-----

After I posted this blog, MikeH asked on FileCab: Is there any way I can figure out if the installed drive uses 4K or emulation mode?

Answer: You can recognize "Advanced Format" drives (also known as 512e or 512 emulation) by using FSUTIL FSINFO NTFSINFO <drive> and looking at the "Bytes per Sector" and "Bytes Per Physical Sector". Those drives will show 512 bytes per sector but 4096 (4K) bytes per physical sector. For more details, read the section titled "Issue 6" at http://support.microsoft.com/kb/982018.
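If you would rather script the check, a small PowerShell wrapper around the same FSUTIL command might look like this sketch (the drive letter is just an example):

# On an Advanced Format (512e) volume you should see 512 bytes per sector
# but 4096 bytes per physical sector
fsutil fsinfo ntfsinfo C: | Select-String "Bytes Per .*Sector"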


TechEd 2011 Session WSV317: Windows Server 2008 R2 File Services Consolidation - Technology Update


This week at TechEd 2011 I will be delivering a presentation about “Windows Server 2008 R2 File Services Consolidation - Technology Update”.

If you’re attending the conference in Atlanta-GA, this is session WSV317 on Wednesday at 10:15 AM, plus a repeat on Thursday also at 10:15 AM.

The presentation is divided into 5 main topics: 

  • Overview of the main advances in File Services in Windows Server 2008 R2, compared to Windows Server 2003/2008
  • Examination of 3 interesting file server configurations with 24, 96 and 192 disks using the File Server Capacity Tool (FSCT)
  • How to consolidate file server names, a common issue when retiring multiple file servers into a single, beefier one
  • Leveraging multiple network interfaces on a file server to provide additional bandwidth and network fault tolerance
  • Implementing High Availability for File Services using Failover Clustering and Virtualization

Here’s a complete outline of the presentation:

  • Agenda
    • Session Objectives
    • Scenario Overview
  •  File Server Scalability and Performance
    • Improvements since Windows Server 2003
      • SMB2 and SMB 2.1 improvements
      • Make sure you’re running the right version…
      • CHKDSK Improvements
      • 8.3 naming disabling and stripping
      • DFS Namespace Scalability
      • Multi-threaded ROBOCOPY
      • Scalability Improvement Over Time
    • Sample Configuration – 24 spindles
    • Sample Configuration – 96 spindles
    • Sample Configuration – 192 spindles!
  • File Server Name Consolidation
    • The name consolidation problem
    • Static DNS Entries
    • Alternate Computer Names and Dynamic DNS
    • DFS Consolidation Roots
    • Virtual Machines
    • Failover Clusters
  • File Server Advanced Networking
    • DNS Round Robin
    • SMB2 Durability
    • Multiple IP addresses per cluster name
    • NIC Teaming
    • Sample Multi-NIC File Server Configurations
      • Standalone, single switch, single client NIC – 2nd NIC disabled
      • Standalone, single switch, single client NIC – NIC teaming
      • Standalone, single switch, single client NIC – same subnet
      • Standalone, multiple switches, single client NIC
      • Standalone, router, single client NIC
      • Standalone, multiple switches, multiple client NICs
      • Cluster, router, single client NIC
      • Cluster, multiple switches, multiple client NICs
  • File Server High Availability
    • Multi-site DFS and Offline Files
    • Single-site DFS
    • Cluster - Active/Passive vs. Multi-Active
    • File Server Cluster – FC SAN
    • File Server Cluster – SAS Array
    • File Server Cluster – iSCSI SAN
    • Virtual File Server – DFS
    • Virtual File Server, Host Cluster
    • Virtual File Server, Guest Cluster
  • Review: Session Objectives

The two demos include SMB2 durability, SQL over SMB2, the Microsoft iSCSI Software Target and a Failover Cluster with Windows Server 2008 R2 SP1 File Services.

Looking forward to seeing you there... And also at the Windows Server booth for File Services (WSV 13).


P.S.: You can now listen to a recording of this presentation at http://channel9.msdn.com/Events/TechEd/NorthAmerica/2011/WSV317
I also posted information about the demo used in this presentation at http://blogs.technet.com/b/josebda/archive/2011/05/19/teched-2011-demo-install-step-by-step-hyper-v-ad-dns-iscsi-target-file-server-cluster-sql-server-over-smb2.aspx

Windows Server 2012 R2: Which version of the SMB protocol (SMB 1.0, SMB 2.0, SMB 2.1, SMB 3.0 or SMB 3.02) are you using?


Note: This blog post is a Windows Server 2012 R2 update on a previous version focused on Windows Server 2012.

 

1. Introduction

With the release of Windows 8.1 and Windows Server 2012 R2, I am frequently asked about how older versions of Windows will behave when connecting to or from these new versions. Upgrading to a new version of SMB is something that happened a few times over the years and we established a process in the protocol itself by which clients and servers negotiate the highest version that both support.

 

2. Versions

There are several different versions of SMB used by Windows operating systems:

  • CIFS – The ancient version of SMB that was part of Microsoft Windows NT 4.0 in 1996. SMB1 supersedes this version.
  • SMB 1.0 (or SMB1) – The version used in Windows 2000, Windows XP, Windows Server 2003 and Windows Server 2003 R2
  • SMB 2.0 (or SMB2) – The version used in Windows Vista (SP1 or later) and Windows Server 2008
  • SMB 2.1 (or SMB2.1) – The version used in Windows 7 and Windows Server 2008 R2
  • SMB 3.0 (or SMB3) – The version used in Windows 8 and Windows Server 2012
  • SMB 3.02 (or SMB3) – The version used in Windows 8.1 and Windows Server 2012 R2

Windows NT is no longer supported, so CIFS is definitely out. Windows Server 2003 R2 with a current service pack is under Extended Support, so SMB1 is still around for a little while. SMB 2.x in Windows Server 2008 and Windows Server 2008 R2 is under Mainstream Support until 2015. You can find the most current information on the support lifecycle page for Windows Server. The information is subject to the Microsoft Policy Disclaimer and Change Notice. You can also use the support pages to find support policy information for Windows XP, Windows Vista, Windows 7 and Windows 8.

In Windows 8.1 and Windows Server 2012 R2, we introduced the option to completely disable CIFS/SMB1 support, including the actual removal of the related binaries. While this is not the default configuration, we recommend disabling this older version of the protocol in scenarios where it’s not useful, like Hyper-V over SMB. You can find details about this new option in item 7 of this blog post: What’s new in SMB PowerShell in Windows Server 2012 R2.
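As a minimal sketch on a Windows Server 2012 R2 machine (the feature name FS-SMB1 is referenced later in this blog; treat the commands as an illustration rather than a rollout plan):

# Check whether the SMB1 binaries are installed
Get-WindowsFeature FS-SMB1
# Completely remove SMB1/CIFS support from the server
Remove-WindowsFeature FS-SMB1
# Alternatively, just turn the protocol off without removing the binaries
Set-SmbServerConfiguration -EnableSMB1Protocol $false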

 

3. Negotiated Versions

Here’s a table to help you understand what version you will end up using, depending on what Windows version is running as the SMB client and what version of Windows is running as the SMB server:

| OS | Windows 8.1 / WS 2012 R2 | Windows 8 / WS 2012 | Windows 7 / WS 2008 R2 | Windows Vista / WS 2008 | Previous versions |
|---|---|---|---|---|---|
| Windows 8.1 / WS 2012 R2 | SMB 3.02 | SMB 3.0 | SMB 2.1 | SMB 2.0 | SMB 1.0 |
| Windows 8 / WS 2012 | SMB 3.0 | SMB 3.0 | SMB 2.1 | SMB 2.0 | SMB 1.0 |
| Windows 7 / WS 2008 R2 | SMB 2.1 | SMB 2.1 | SMB 2.1 | SMB 2.0 | SMB 1.0 |
| Windows Vista / WS 2008 | SMB 2.0 | SMB 2.0 | SMB 2.0 | SMB 2.0 | SMB 1.0 |
| Previous versions | SMB 1.0 | SMB 1.0 | SMB 1.0 | SMB 1.0 | SMB 1.0 |

* WS = Windows Server

  

4. Using PowerShell to check the SMB version

In Windows 8 or Windows Server 2012, there is a new PowerShell cmdlet that can easily tell you what version of SMB the client has negotiated with the File Server. You simply access a remote file server (or create a new mapping to it) and use Get-SmbConnection. Here’s an example:

PS C:\> Get-SmbConnection
 

ServerName   ShareName  UserName            Credential          Dialect   NumOpens
----------   ---------  --------            ----------          -------   --------
FileServer1  IPC$       DomainName\UserN... DomainName.Testi... 3.00      0
FileServer1  FileShare  DomainName\UserN... DomainName.Testi... 3.00      14
FileServ2    FS2        DomainName\UserN... DomainName.Testi... 3.02      3 
VNX3         Share1     DomainName\UserN... DomainName.Testi... 3.00      6
Filer2       Library    DomainName\UserN... DomainName.Testi... 3.00      8

DomainCtrl1  netlogon   DomainName\Compu... DomainName.Testi... 2.10      1

In the example above, a server called “FileServer1” was able to negotiate up to version 3.0. FileServ2 can use version 3.02. That means that both the client and the server support the latest version of the SMB protocol. You can also see that another server called “DomainCtrl1” was only able to negotiate up to version 2.1. You can probably guess that it’s a domain controller running Windows Server 2008 R2. Some of the servers on the list are not running Windows, showing the dialect that these non-Windows SMB implementations negotiated with this specific Windows client.

If you just want to find the version of SMB running on your own computer, you can use a loopback share combined with the Get-SmbConnection cmdlet. Here’s an example:

PS C:\> dir \\localhost\c$
 
Directory: \\localhost\c$

 
Mode                LastWriteTime     Length Name

----                -------------     ------ ----
d----         5/19/2012   1:54 AM            PerfLogs
d-r--          6/1/2012  11:58 PM            Program Files
d-r--          6/1/2012  11:58 PM            Program Files (x86)
d-r--         5/24/2012   3:56 PM            Users
d----          6/5/2012   3:00 PM            Windows
 
PS C:\> Get-SmbConnection -ServerName localhost
 
ServerName  ShareName  UserName            Credential          Dialect  NumOpens
----------  ---------  --------            ----------          -------  --------
localhost   c$         DomainName\UserN... DomainName.Testi... 3.02     0

 

You have about 10 seconds after you issue the “dir” command to run the “Get-SmbConnection” cmdlet. The SMB client will tear down the connections if there is no activity between the client and the server. It might help to know that you can use the alias “gsmbc” instead of the full cmdlet name.
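For example, here is a small sketch that combines the two steps so the connection is still alive when you query it (it assumes the default administrative share C$ is available, as in the example above):

PS C:\> dir \\localhost\c$ | Out-Null; Get-SmbConnection -ServerName localhost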

 

5. Features and Capabilities

Here’s a very short summary of what changed with each version of SMB:

  • From SMB 1.0 to SMB 2.0 - The first major redesign of SMB
    • Increased file sharing scalability
    • Improved performance
      • Request compounding
      • Asynchronous operations
      • Larger reads/writes
    • More secure and robust
      • Small command set
      • Signing now uses HMAC SHA-256 instead of MD5
      • SMB2 durability
  • From SMB 2.0 to SMB 2.1
    • File leasing improvements
    • Large MTU support
    • BranchCache
  • From SMB 2.1 to SMB 3.0
    • Availability
      • SMB Transparent Failover
      • SMB Witness
      • SMB Multichannel
    • Performance
      • SMB Scale-Out
      • SMB Direct (SMB 3.0 over RDMA)
      • SMB Multichannel
      • Directory Leasing
      • BranchCache V2
    • Backup
      • VSS for Remote File Shares
    • Security
      • SMB Encryption using AES-CCM (Optional)
      • Signing now uses AES-CMAC
    • Management
      • SMB PowerShell
      • Improved Performance Counters
      • Improved Eventing
  • From SMB 3.0 to SMB 3.02
    • Automatic rebalancing of Scale-Out File Server clients
    • Improved performance of SMB Direct (SMB over RDMA)
    • Support for multiple SMB instances on a Scale-Out File Server

You can get additional details on the SMB 2.0 improvements listed above at
http://blogs.technet.com/b/josebda/archive/2008/12/09/smb2-a-complete-redesign-of-the-main-remote-file-protocol-for-windows.aspx

You can get additional details on the SMB 3.0 improvements listed above at
http://blogs.technet.com/b/josebda/archive/2012/05/03/updated-links-on-windows-server-2012-file-server-and-smb-3-0.aspx

You can get additional details on the SMB 3.02 improvements in Windows Server 2012 R2 at
http://technet.microsoft.com/en-us/library/hh831474.aspx

 

6. Recommendation

We strongly encourage you to update to the latest version of SMB, which will give you the most scalability, the best performance, the highest availability and the most secure SMB implementation.

Keep in mind that Windows Server 2012 Hyper-V and Windows Server 2012 R2 Hyper-V only support SMB 3.0 for remote file storage. This is due mainly to the availability features (SMB Transparent Failover, SMB Witness and SMB Multichannel), which did not exist in previous versions of SMB. The additional scalability and performance is also very welcome in this virtualization scenario. The Hyper-V Best Practices Analyzer (BPA) will warn you if an older version is detected.

 

7. Conclusion

We’re excited about SMB3, but we are also always concerned about keeping as much backwards compatibility as possible. Both SMB 3.0 and SMB 3.02 bring several key new capabilities and we encourage you to learn more about them. We hope you will be convinced to start planning your upgrades as early as possible.

 


Note 1: Protocol Documentation

If you consider yourself an SMB geek and you actually want to understand the SMB NEGOTIATE command in greater detail, you can read the [MS-SMB2-Preview] protocol documentation (which covers SMB 2.0, 2.1, 3.0 and 3.02), currently available from http://msdn.microsoft.com/en-us/library/ee941641.aspx. In regards to protocol version negotiation, you should pay attention to the following sections of the document:

  • 1.7: Versioning and Capability Negotiation
  • 2.2.3: SMB2 Negotiate Request
  • 2.2.4: SMB2 Negotiate Response

Section 1.7 includes a nice state diagram describing the inner workings of protocol negotiation.

 

Note 2: Third-party implementations

There are several implementations of the SMB protocol from vendors other than Microsoft. If you use one of those implementations, you should ask whoever provides it which version of SMB they implement for each version of their product.

Please note that any list of third-party implementations is bound to become obsolete the minute I post it. Please refer to the specific implementers for up-to-date information on their specific implementations and which versions and optional portions of the protocol they offer.

You may also want to review the SNIA Tutorial "SMB Remote File Protocol (including SMB 3.0)". The SNIA Data Storage Innovation Conference (DSI’14), on April 22-24, 2014, is offering an updated version of this tutorial.

The Deprecation of SMB1 – You should be planning to get rid of this old SMB dialect


I regularly get a question about when SMB1 will be completely removed from Windows. This blog post summarizes the current state of this old SMB dialect in Windows client and server.

 

1) SMB1 is deprecated, but not yet removed

We already added SMB1 to the Windows Server 2012 R2 deprecation list in June 2013. That does not mean it’s fully removed, but that the feature is “planned for potential removal in subsequent releases”. You can find the Windows Server 2012 R2 deprecation list at https://technet.microsoft.com/en-us/library/dn303411.aspx.

 

2) Windows Server 2003 is going away

The last supported Windows operating system that can only negotiate SMB1 is Windows Server 2003. All other currently supported Windows operating systems (client and server) are able to negotiate SMB2 or higher. Windows Server 2003 support will end on July 14 of this year, as you probably heard.

 

3) SMB versions in current releases of Windows and Windows Server

Aside from Windows Server 2003, all other versions of Windows (client and server) support newer versions of SMB:

  • Windows Server 2008 or Windows Vista – SMB1 or SMB2
  • Windows Server 2008 R2 or Windows 7 – SMB1 or SMB2
  • Windows Server 2012 and Windows 8 – SMB1, SMB2 or SMB3
  • Windows Server 2012 R2 and Windows 8.1 – SMB1, SMB2 or SMB3

For details on specific dialects and how they are negotiated, see this blog post on SMB dialects and Windows versions.


4) SMB1 removal in Windows Server 2012 R2 and Windows 8.1

In Windows Server 2012 R2 and Windows 8.1, we made SMB1 an optional component that can be completely removed. That optional component is enabled by default, but a system administrator now has the option to completely disable it. For more details, see this blog post on how to completely remove SMB1 in Windows Server 2012 R2.

 

5) SMB1 removal in Windows 10 Technical Preview and Windows Server Technical Preview

SMB1 will continue to be an optional component enabled by default with Windows 10, which is scheduled to be released in 2015. The next version of Windows Server, which is expected in 2016, will also likely continue to have SMB1 as an optional component enabled by default. In that release we will add an option to audit SMB1 usage, so IT Administrators can assess if they can disable SMB1 on their own.

 

6) What you should be doing about SMB1

If you are a systems administrator and you manage IT infrastructure that relies on SMB1, you should prepare to remove SMB1.  Once Windows Server 2003 is gone, the main concern will be third party software or hardware like printers, scanners, NAS devices and WAN accelerators. You should make sure that any new software and hardware that requires the SMB protocol is able to negotiate newer versions (at least SMB2, preferably SMB3). For existing devices and software that only support SMB1, you should contact the manufacturer for updates to support the newer dialects.
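As an illustrative first step, you can get a rough inventory on each file server by grouping the currently connected SMB sessions by dialect; anything reporting a 1.x dialect is an SMB1 client you still need to track down:

PS C:\> Get-SmbSession | Group-Object Dialect | Sort-Object Name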

If you are a software or hardware manufacturer that has a dependency on the SMB1 protocol, you should have a clear plan for removing any such dependencies. Your hardware or software should be ready to operate in an environment where Windows clients and servers only support SMB2 or SMB3. While it’s true that today SMB1 still works in most environments, the fact that the feature is deprecated is a warning that it could go away at any time.

 

7) Complete removal of SMB1

Since SMB1 is a deprecated component, we will assess for its complete removal with every new release.

What’s new in SMB 3.1.1 in the Windows Server 2016 Technical Preview 2


 

1. Introduction

Every new version of Windows brings updates to our main remote file protocol, known as SMB (Server Message Block).

If you’re not familiar with it, you can find some information in this previous blog post: Windows Server 2012 R2: Which version of the SMB protocol (SMB 1.0, SMB 2.0, SMB 2.1, SMB 3.0 or SMB 3.02) are you using?

In this blog post, you’ll see what changed with the new version of SMB that comes with the Windows 10 Insider Preview released in late April 2015 and the Windows Server 2016 Technical Preview 2 released in early May 2015.

 

2. Protocols Changes in SMB 3.1.1

This section covers changes in SMB 3.1.1 related to the protocol itself.

The Protocol Preview document fully describes these changes: [MS-SMB2-Diff]- Server Message Block (SMB) Protocol Versions 2 and 3, but you can see the highlights below.

 

2.1. Pre-Authentication Integrity

Pre-authentication integrity provides improved protection from a man-in-the-middle attacker tampering with SMB’s connection establishment and authentication messages.

Pre-Auth integrity verifies all the “negotiate” and “session setup” exchanges used by SMB with a strong cryptographic hash (SHA-512).

If your client and your server establish an SMB 3.1.1 session, you can be sure that no one has tampered with the connection and session properties.

Using SMB signing on top of an SMB 3.1.1 session protects you from an attacker tampering with any packets.

Using SMB encryption on top of an SMB 3.1.1 session protects you from an attacker tampering with or eavesdropping on any packets.

Although there is a cost to enable SMB signing or SMB encryption, we highly recommend enabling one of them.
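As a minimal sketch (server-side settings shown; the share name "Orders" is only an example), signing or encryption can be required with the SMB PowerShell cmdlets:

# Require SMB signing for the whole server
Set-SmbServerConfiguration -RequireSecuritySignature $true
# Or require SMB encryption, either server-wide or for a single share
Set-SmbServerConfiguration -EncryptData $true
Set-SmbShare -Name Orders -EncryptData $true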

Note: While these changes improve overall security, they might interfere with some solutions that rely on modifying SMB network traffic, like certain kinds of WAN accelerators.

 

2.2. SMB Encryption Improvements

SMB Encryption, introduced with SMB 3.0, used a fixed cryptographic algorithm: AES-128-CCM.

Since then, we have learned that AES-128-GCM performs better in most modern processors.

To take advantage of that, SMB 3.1.1 offers a mechanism to negotiate the crypto algorithm per connection, with options for AES-128-CCM and AES-128-GCM.

We made AES-128-GCM the default for new Windows versions, while older versions will continue to use AES-128-CCM.

With this flexible infrastructure for negotiation in place, we could add more algorithms in the future.

We observed that moving from AES-128-CCM to AES-128-GCM showed a 2x improvement in certain scenarios, like copying large files over an encrypted SMB connection.

 

2.3. Cluster Dialect Fencing

Provides support for cluster rolling upgrade for Scale-Out File Servers. For details, see http://technet.microsoft.com/en-us/library/dn765474.aspx#BKMK_RollingUpgrade

In this new scenario, a single SMB server appears to support different maximum dialects of SMB, depending on whether the SMB client is accessing clustered or non-clustered file shares.

For local, non-clustered file shares, the server offers up to 3.1.1 during dialect negotiation.

For clustered shares, if the cluster is in mixed mode (before upgrading the cluster functional level), it will offer up to 3.0.2 during dialect negotiation.

After you upgrade the cluster functional level, it then offers all clients the new 3.1.1 dialect.

 

3. Other SMB changes that are not protocol-related

There are other changes in Windows that change the SMB Client or SMB Server implementation, but not the protocol itself.

Here are a few important changes in that category:

 

3.1. Removing RequireSecureNegotiate setting

In previous versions of SMB, we introduced “Secure Negotiate”, where the SMB client and server verify integrity of the SMB negotiate request and response messages.

Because some third-party implementations of SMB did not correctly perform this negotiation, we introduced a switch to disable “Secure Negotiate”. We explain this in more detail in this blog post.

Since we have learned via our SMB PlugFests that third parties have fixed their implementations, we are removing the option to bypass “Secure Negotiate” and SMB always performs negotiate validation if the connection’s dialect is 2.x.x or 3.0.x.

Note 1: For SMB 3.1.1 clients and servers, the new Pre-Authentication Integrity feature (described in item 2.1 above) supersedes “Secure Negotiate” with many advantages.

Note 2: With the new release, any third party SMB 2.x.x or SMB 3.0.x implementations that do not implement “Secure Negotiate” will be unable to connect to Windows.

Note 3: While this change improves overall security, it might interfere with some solutions that rely on modifying SMB network traffic, like certain kinds of WAN accelerators.
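If you want to check how a down-level (SMB 2.x.x or 3.0.x) Windows client is currently configured before this change reaches it, the setting is visible in the SMB client configuration (shown as an illustration; the property may not be exposed on every Windows version):

PS C:\> Get-SmbClientConfiguration | Select-Object RequireSecureNegotiate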

 

3.2. Dialects with non-zero revision number now reported with the x.y.z notation

As you probably noticed throughout this blog post, we’re using 3 separate digits to notate the version of SMB.

In the past, you might have seen us talk about SMB 3.02. Now we call that SMB 3.0.2.

Note that there is no change when the revision number is 0, like SMB 2.1 or SMB 3.0 (we don’t call them SMB 2.1.0 or SMB 3.0.0).

This new format avoids confusion when comparing SMB dialects and better represents the actual version information used by SMB.

You can use the Get-SmbConnection cmdlet on the Windows SMB client to report the currently used SMB dialects.
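For example, a quick way to see where the new x.y.z notation surfaces (the connection list will of course vary per machine):

PS C:\> Get-SmbConnection | Select-Object ServerName, ShareName, Dialect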

 

4. Which protocol is negotiated?

Please note that SMB clients and SMB servers negotiate the SMB dialect that they will use based on each side’s offer.

Here’s a table to help you understand what version you will end up using, depending on what Windows version is running as the SMB client and what version of Windows is running as the SMB server:

 

| OS | Windows 10 / WS 2016 TP2 | Windows 8.1 / WS 2012 R2 | Windows 8 / WS 2012 | Windows 7 / WS 2008 R2 | Windows Vista / WS 2008 | Previous versions |
|---|---|---|---|---|---|---|
| Windows 10 / WS 2016 TP2 | SMB 3.1.1 | SMB 3.0.2 | SMB 3.0 | SMB 2.1 | SMB 2.0.2 | SMB 1.x |
| Windows 8.1 / WS 2012 R2 | SMB 3.0.2 | SMB 3.0.2 | SMB 3.0 | SMB 2.1 | SMB 2.0.2 | SMB 1.x |
| Windows 8 / WS 2012 | SMB 3.0 | SMB 3.0 | SMB 3.0 | SMB 2.1 | SMB 2.0.2 | SMB 1.x |
| Windows 7 / WS 2008 R2 | SMB 2.1 | SMB 2.1 | SMB 2.1 | SMB 2.1 | SMB 2.0.2 | SMB 1.x |
| Windows Vista / WS 2008 | SMB 2.0.2 | SMB 2.0.2 | SMB 2.0.2 | SMB 2.0.2 | SMB 2.0.2 | SMB 1.x |
| Previous versions | SMB 1.x | SMB 1.x | SMB 1.x | SMB 1.x | SMB 1.x | SMB 1.x |

* WS = Windows Server

 

Note: Earlier Windows 10 and Windows Server 2016 previews used SMB dialect version 3.1.

 

5. Considering your options for removing the older SMB1 protocol

When Windows Server 2003 hits the end of its extended support later this year, the last supported version of Windows that only works with SMB1 will be gone.

SMB1 is already a separate component in Windows that you can completely remove. However, up to this point, Windows still enables it by default for compatibility reasons.

The next logical step (which we are planning for a future release of Windows) will be to ship SMB1 disabled by default, but still available if necessary.

To help with this transition, you can now enable auditing of SMB1 traffic in your SMB server using PowerShell. This will alert you via events if any clients are still using SMB1.

To enable auditing of SMB1 traffic, use the cmdlet: Set-SmbServerConfiguration -AuditSmb1Access $true

To view the SMB1 events, use the cmdlet: Get-WinEvent -LogName Microsoft-Windows-SMBServer/Audit

If you feel confident that there are no SMB1 clients in your network, you can uninstall SMB1 from your server using the cmdlet: Remove-WindowsFeature FS-SMB1
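Putting the three cmdlets above together, a minimal sketch of the workflow on the SMB server might look like this (run from an elevated PowerShell prompt; cmdlet, log and feature names are exactly as listed above):

# Start auditing SMB1 usage
Set-SmbServerConfiguration -AuditSmb1Access $true
# Review the audit events to see which clients still connect with SMB1
Get-WinEvent -LogName Microsoft-Windows-SMBServer/Audit
# Only after confirming no SMB1 clients remain, remove the component
Remove-WindowsFeature FS-SMB1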

 

6. Conclusion

I hope this blog post helps you prepare for the upcoming changes in SMB.

We also recommend that you download the SNIA Tutorial on SMB 3, which we recently updated to include details of the 3.1.1 dialect. You can find a copy of that tutorial at http://www.snia.org/sites/default/files2/DSI2015/presentations/FileSystems/JoseBarreto_SMB3_remote%20file%20protocol.pdf

Drive Performance Report Generator – PowerShell script using DiskSpd by Arnaud Torres


Arnaud Torres is a Senior Premier Field Engineer at Microsoft in France who sent me the PowerShell script below called “Drive Performance Report Generator”.

He created the script to test a wide range of profiles in one run to allow people to build a baseline of their storage using DiskSpd.EXE.

The script is written in PowerShell v1 and was tested on a Windows Server 2008 SP2 (really!), Windows Server 2012 R2 and Windows 10.

It displays results in real time, is highly documented and creates a text report which can be imported as CSV in Excel.

 

Thanks to Arnaud for sharing!

 

———————-

 

# Drive Performance Report Generator
# by Arnaud TORRES
# Microsoft provides script, macro, and other code examples for illustration only, without warranty either expressed or implied, including but not
# limited to the implied warranties of merchantability and/or fitness for a particular purpose. This script is provided 'as is' and Microsoft does not
# guarantee that the following script, macro, or code can be used in all situations.
# Script will stress your computer CPU and storage, be sure that no critical workload is running

# Clear screen
Clear-Host

write-host "DRIVE PERFORMANCE REPORT GENERATOR" -foregroundcolor green
write-host "Script will stress your computer CPU and storage layer (including network if applicable!), be sure that no critical workload is running" -foregroundcolor yellow
write-host "Microsoft provides script, macro, and other code examples for illustration only, without warranty either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose. This script is provided 'as is' and Microsoft does not guarantee that the following script, macro, or code can be used in all situations." -foregroundcolor darkred
"   "
"Test will use all free space on drive minus 2 GB !"
"If there are less than 4 GB free test will stop"

# Disk to test
$Disk = Read-Host 'Which disk would you like to test ? (example : D:)'
# $Disk = "D:"
if ($disk.length -ne 2){"Wrong drive letter format used, please specify the drive as D:"
                         Exit}
if ($disk.substring(1,1) -ne ":"){"Wrong drive letter format used, please specify the drive as D:"
                         Exit}
$disk = $disk.ToUpper()

# Reset test counter
$counter = 0

# Use 1 thread / core
$Thread = "-t"+(Get-WmiObject win32_processor).NumberofCores

# Set time in seconds for each run
# 10-120s is fine
$Time = "-d1"

# Outstanding IOs
# Should be 2 times the number of disks in the RAID
# Between 8 and 16 is generally fine
$OutstandingIO = "-o16"

# Disk preparation
# Delete testfile.dat if it exists
# The test will use all free space -2GB

$IsDir = test-path -path "$Disk\TestDiskSpd"
$isdir
if ($IsDir -like "False"){new-item -itemtype directory -path "$Disk\TestDiskSpd\"}
# Just a little security, in case we are working on a compressed drive ...
compact /u /s $Disk\TestDiskSpd\

$Cleaning = test-path -path "$Disk\TestDiskSpd\testfile.dat"
if ($Cleaning -eq "True")
{"Removing current testfile.dat from drive"
  remove-item $Disk\TestDiskSpd\testfile.dat}

$Disks = Get-WmiObject win32_logicaldisk
$LogicalDisk = $Disks | where {$_.DeviceID -eq $Disk}
$Freespace = $LogicalDisk.freespace
$FreespaceGB = [int]($Freespace / 1073741824)
$Capacity = $FreespaceGB - 2
$CapacityParameter = "-c"+$Capacity+"G"
$CapacityO = $Capacity * 1073741824

if ($FreespaceGB -lt "4")
{
       "Not enough space on the Disk ! More than 4GB needed"
       Exit
}

write-host " "
$Continue = Read-Host "You are about to test $Disk which has $FreespaceGB GB free, do you want to continue ? (Y/N) "
if ($continue -ne "y" -and $continue -ne "Y"){"Test Cancelled !!"
                                        Exit}

"   "
"Initialization can take some time, we are generating a $Capacity GB file..."
"  "

# Initialize output file
$date = get-date

# Add the tested disk and the date in the output file
"Disk $disk, $date" >> ./output.txt

# Add the headers to the output file
"Test N#, Drive, Operation, Access, Blocks, Run N#, IOPS, MB/sec, Latency ms, CPU %" >> ./output.txt

# Number of tests
# Multiply the number of loops to change this value
# By default there are : (4 blocks sizes) X (2 for read 100% and write 100%) X (2 for Sequential and Random) X (4 Runs of each)
$NumberOfTests = 64

"  "
write-host "TEST RESULTS (also logged in .\output.txt)" -foregroundcolor yellow

# Begin Tests loops

# We will run the tests with 4K, 8K, 64K and 512K blocks
(4,8,64,512) | % {
$BlockParameter = ("-b"+$_+"K")
$Blocks = ("Blocks "+$_+"K")

# We will do Read tests and Write tests
  (0,100) | % {
      if ($_ -eq 0){$IO = "Read"}
      if ($_ -eq 100){$IO = "Write"}
      $WriteParameter = "-w"+$_

# We will do random and sequential IO tests
  ("r","si") | % {
      if ($_ -eq "r"){$type = "Random"}
      if ($_ -eq "si"){$type = "Sequential"}
      $AccessParameter = "-"+$_

# Each run will be done 4 times
  (1..4) | % {

      # The test itself (finally !!)
         $result = .\diskspd.exe $CapacityParameter $Time $AccessParameter $WriteParameter $Thread $OutstandingIO $BlockParameter -h -L $Disk\TestDiskSpd\testfile.dat

      # Now we will break the very verbose output of DiskSpd in a single line with the most important values
      foreach ($line in $result) {if ($line -like "total:*") { $total=$line; break } }
      foreach ($line in $result) {if ($line -like "avg.*") { $avg=$line; break } }
      $mbps = $total.Split("|")[2].Trim()
      $iops = $total.Split("|")[3].Trim()
      $latency = $total.Split("|")[4].Trim()
      $cpu = $avg.Split("|")[1].Trim()
      $counter = $counter + 1

      # A progress bar, for the fun
      Write-Progress -Activity ".\diskspd.exe $CapacityParameter $Time $AccessParameter $WriteParameter $Thread $OutstandingIO $BlockParameter -h -L $Disk\TestDiskSpd\testfile.dat" -status "Test in progress" -percentComplete ($counter / $NumberOfTests * 100)

      # Remove comment to check command line ".\diskspd.exe $CapacityParameter $Time $AccessParameter $WriteParameter $Thread $OutstandingIO $BlockParameter -h -L $Disk\TestDiskSpd\testfile.dat"

      # We output the values to the text file
      "Test $Counter,$Disk,$IO,$type,$Blocks,Run $_,$iops,$mbps,$latency,$cpu"  >> ./output.txt

      # We output a verbose format on screen
      "Test $Counter, $Disk, $IO, $type, $Blocks, Run $_, $iops iops, $mbps MB/sec, $latency ms, $cpu CPU"
}
}
}
}
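If you want a quick look at the results without opening Excel, here is a small illustrative one-liner (it assumes you run it from the folder containing output.txt and skips the disk/date line the script writes first):

PS C:\> Get-Content .\output.txt | Select-Object -Skip 1 | ConvertFrom-Csv | Format-Table -AutoSize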

 

Twenty years as a Microsoft Certified Professional – time flies when you’re having fun


 

I just noticed that last week was the 20th anniversary of my first Microsoft certification. I had to travel nearly 500 miles (from Fortaleza to Recife) to reach the closest official testing center available in Brazil in August 1995.

You’re probably thinking that I started by taking the Windows 95 exam, but it was actually the Windows 3.1 exam (which included a lot of MS-DOS 6.x stuff). The Windows 95 exam was my next one, but that only happened over a year later in December 1996.

I went on to take absolutely all of the Windows NT 4.0 and Windows 2000 exams (many of them in their beta version). At that point we had multiple Microsoft Certified Partners in Fortaleza and I worked for one of them.

I continued to take lots of exams even after I moved to the US in October 2000 and after I joined Microsoft in October 2002. I only slowed down a bit after joining the Windows Server engineering team in October 2007.

In 2009 I achieved my last certification as a Microsoft Certified Master on SQL Server 2008. That took a few weeks of training, a series of written exams and a final, multi-hour lab exam. Exciting stuff! That also later granted me a charter certification as Microsoft Certified Solutions Master (Data Platform), Microsoft Certified Solutions Expert (Data Platform) and Microsoft Certified Solutions Associate (SQL Server 2012).

My full list is shown below. In case you’re wondering, the Windows 10 exam (Configuring Windows Devices) is already in development and you can find the details at https://www.microsoft.com/learning/en-us/exam-70-697.aspx.

 

[Images: full list of Microsoft certifications]

Raw notes from the Storage Developer Conference 2015 (SNIA SDC 2015)


Notes and disclaimers:

  • This blog post contains raw notes for some of the SNIA’s SDC 2015 presentations (SNIA’s Storage Developers Conference 2015)
  • These notes were typed during the talks and they may include typos and my own misinterpretations.
  • Text in the bullets under each talk are quotes from the speaker or text from the speaker slides, not my personal opinion.
  • If you feel that I misquoted you or badly represented the content of a talk, please add a comment to the post.
  • I spent limited time fixing typos or correcting the text after the event. There are only so many hours in a day…
  • I have not attended all sessions (since there are many being delivered at a time, that would actually not be possible :-)…
  • SNIA usually posts the actual PDF decks a few weeks after the event. Attendees have access immediately.
  • You can find the event agenda at http://www.snia.org/events/storage-developer/agenda

 

Understanding the Intel/Micron 3D XPoint Memory
Jim Handy, General Director, Objective Analysis

  • Memory analyst, SSD analyst, blogs: http://thememoryguy.com, http://thessdguy.com
  • Not much information available since the announcement in July: http://newsroom.intel.com/docs/DOC-6713
  • Agenda: What? Why? Who? Is the world ready for it? Should I care? When?
  • What: Picture of the 3D XPoint concept (pronounced 3d-cross-point). Micron’s photograph of “the real thing”.
  • Intel has researched PCM for 45 years. Mentioned in an Intel article at “Electronics” in Sep 28, 1970.
  • The many elements that have been tried shown in the periodic table of elements.
  • NAND laid the path to the increased hierarchy levels. Showed prices of DRAM/NAND from 2001 to 2015. Gap is now 20x.
  • Comparing bandwidth to price per gigabytes for different storage technologies: Tape, HDD, SSD, 3D XPoint, DRAM, L3, L2, L1
  • Intel diagram mentions PCM-based DIMMs (far memory) and DDR DIMMs (near memory).
  • Chart with latency for HDD SAS/SATA, SSD SAS/SATA, SSD NVMe, 3D XPoint NVMe – how much of it is the media, how much is the software stack?
  • 3D Xpoint’s place in the memory/storage hierarchy. IOPS x Access time. DRAM, 3D XPoint (Optane), NVMe SSD, SATA SSD
  • Great gains at low queue depth. 800GB SSD read IOPS using 16GB die. IOPS x queue depth of NAND vs. 3D XPoint.
  • Economic benefits: measuring $/write IOPS for SAS HDD, SATA SSD, PCIe SSD, 3D XPoint
  • Timing is good because: DRAM is running out of speed, NVDIMMs are catching on, some sysadmins understand how to use flash to reduce DRAM needs
  • Timing is bad because: Nobody can make it economically, no software supports SCM (storage class memory), new layers take time to establish
  • Why should I care: better cost/perf ratio, lower power consumption (less DRAM, more perf/server, lower OpEx), in-memory DB starts to make sense
  • When? Micron slide projects 3D XPoint at end of FY17 (two months ahead of CY). Same slide shows NAND production surpassing DRAM production in FY17.
  • Comparing average price per GB compared to the number of GB shipped over time. It takes a lot of shipments to lower price.
  • Looking at the impact in the DRAM industry if this actually happens. DRAM slows down dramatically starting in FY17, as 3D XPoint revenues increase (optimistic).

 

Next Generation Data Centers: Hyperconverged Architectures Impact On Storage
Mark OConnell, Distinguished Engineer, EMC

  • History: Client/Server -> shared SANs -> Scale-Out systems
  • >> Scale-Out systems: architecture, expansion, balancing
  • >> Evolution of the application platform: physical servers -> virtualization -> virtualized application farm
  • >> Virtualized application farms and storage: local storage -> Shared Storage (SAN) -> Scale-Out Storage -> Hyper-converged
  • >> Early hyper-converged systems: HDFS (Hadoop) -> JVM/Tasks/HDFS in every node
  • Effects of hyper-converged systems
  • >> Elasticity (compute/storage density varies)
  • >> App management, containers, app frameworks
  • >> Storage provisioning: frameworks (openstack swift/cinder/manila), pure service architectures
  • >> Hybrid cloud enablement. Apps as self-describing bundles. Storage as a dynamically bound service. Enables movement off-prem.

 

Implications of Emerging Storage Technologies on Massive Scale Simulation Based Visual Effects
Yahya H. Mirza, CEO/CTO, Aclectic Systems Inc

  • Steve Jobs quote: “You‘ve got to start with the customer experience and work back toward the technology”.
  • Problem 1: Improve customer experience. Higher resolution, frame rate, throughput, etc.
  • Problem 2: Production cost continues to rise.
  • Problem 3: Time to render single frame remains constant.
  • Problem 4: Render farm power and cooling increasing. Coherent shared memory model.
  • How do you reduce customer CapEx/OpEx? Low efficiency: 30% CPU. Problem is memory access latency and I/O.
  • Production workflow: modeling, animation/simulation/shading, lighting, rendering, compositing. More and more simulation.
  • Concrete production experiment: 2005. Story boards. Attempt to create a short film. Putting himself in the customer’s shoes. Shot decomposition.
  • Real 3-minute short costs $2 million. Animatic to pitch the project.
  • Character modeling and development. Includes flesh and muscle simulation. A lot of it done procedurally.
  • Looking at Disney’s “Big Hero 6”, DreamWorks’ “Puss in Boots” and Weta’s “The Hobbit”, including simulation costs, frame rate, resolution, size of files, etc.
  • Physically based rendering: global illumination effects, reflection, shadows. Comes down to light transport simulation, physically based materials description.
  • Exemplary VFX shot pipeline. VFX Tool (Houdini/Maya), Voxelized Geometry (OpenVDB), Scene description (Alembic), Simulation Engine (PhysBam), Simulation Farm (RenderFarm), Simulation Output (OpenVDB), Rendering Engine (Mantra), Render Farm (RenderFarm), Output format (OpenEXR), Compositor (Flame), Long-term storage.
  • One example: smoke simulation – reference model smoke/fire VFX. Complicated physical model. Hotspot algorithms: monte-carlo integration, ray-intersection test, linear algebra solver (multigrid).
  • Storage implications. Compute storage (scene data, simulation data), Long term storage.
  • Is public cloud computing viable for high-end VFX?
  • Disney’s data center. 55K cores across 4 geos.
  • Vertically integrated systems are going to be more and more important. FPGAs, ARM-based servers.
  • Aclectic Colossus smoke demo. Showing 256x256x256.
  • We don’t want coherency; we don’t want sharing. Excited about Intel OmniPath.
  • http://www.intel.com/content/www/us/en/high-performance-computing-fabrics/omni-path-architecture-fabric-overview.html

 

How Did Human Cells Build a Storage Engine?
Sanjay Joshi, CTO Life Sciences, EMC

  • Human cell, Nuclear DNA, Transcription and Translation, DNA Structure
  • The data structure: [char(3*10^9) human_genome] strand
  • 3 gigabases [(3*10^9)*2]/8 = ~750MB. With overlaps, ~1GB per cell. 15-70 trillion cells.
  • Actual files used to store genome are bigger, between 10GB and 4TB (includes lots of redundancy).
  • Genome sequencing will surpass all other data types by 2040
  • Protein coding portion is just a small portion of it. There’s a lot we don’t understand.
  • Nuclear DNA: Is it a file? Flat file system, distributed, asynchronous. Search header, interpret, compile, execute.
  • Nuclear DNA properties: Large:~20K genes/cell, Dynamic: append/overwrite/truncate, Semantics: strict, Consistent: No, Metadata: fixed, View: one-to-many
  • Mitochondrial DNA: Object? Distributed hash table, a ring with 32 partitions. Constant across generations.
  • Mitochondrial DNA: Small: ~40 genes/cell, Static: constancy, energy functions, Semantics: single origin, Consistent: Yes, Metadata: system based, View: one-to-one
  • File versus object. Comparing Nuclear DNA and Mitochondrial DNA characteristics.
  • The human body: 7,500 named parts, 206 regularly occurring bones (newborns close to 300), ~640 skeletal muscles (320 pairs), 60+ organs, 37 trillion cells. Distributed cluster.
  • Mapping the ISO 7 layers to this system. Picture.
  • Finite state machine: max 10^45 states at 4*10^53 state-changes/sec. 10^24 NOPS (nucleotide ops per second) across biosphere.
  • Consensus in cell biology: Safety: under all conditions: apoptosis. Availability: billions of replicate copies. Not timing dependent: asynchronous. Command completion: 10 base errors in every 10,000 protein translation (10 AA/sec).
  • Object vs. file. Object: Maternal, Static, Haploid. Small, Simple, Energy, Early. File: Maternal and paternal, Diploid. Scalable, Dynamic, Complex. All cells are female first.

 

Move Objects to LTFS Tape Using HTTP Web Service Interface
Matt Starr, Chief Technical Officer, Spectra Logic
Jeff Braunstein, Developer Evangelist, Spectra Logic

  • Worldwide data growth: 2009 = 800 EB, 2015 = 6.5ZB, 2020 = 35ZB
  • Genomics. 6 cows = 1TB of data. They keep it forever.
  • Video data. SD to Full HD to 4K UHD (4.2TB per hours) to 8K UHD. Also kept forever.
  • Intel slide on the Internet minute. 90% of the people of the world never took a picture with anything but a camera phone.
  • IOT – Total digital info create or replicated.
  • $1000 genome scan takes 780MB fully compressed. 2011 HiSeq-2000 scanner generates 20TB per month. Typical camera generates 105GB/day.
  • More and more examples.
  • Tape storage is the lowest cost. But it’s also complex to deploy. Comparing to Public and Private cloud…
  • Pitfalls of public cloud – chart of $/PB/day. OpEx per PB/day reaches very high for public cloud.
  • Risk of public cloud: Amazon has 1 trillion objects. If they lose 1%, that would be 10 billion objects.
  • Risk of public cloud: Nirvanix. VC pulled the plug in September 2013.
  • Cloud: Good: toolkits, naturally WAN friendly, user expectation: put it away.
  • What if: Combine S3/Object with tape. Spectra S3 – Front end is REST, backend is LTFS tape.
  • Cost: $.09/GB. 7.2PB. Potentially a $0.20 two-copy archive.
  • Automated: App or user-built. Semi-Automated: NFI or scripting.
  • Information available at https://developer.spectralogic.com
  • All the tools you need to get started. Including simulator of the front end (BlackPearl) in a VM.
  • S3 commands, plus data to write sequentially in bulk fashion.
  • Configure user for access, buckets.
  • Deep storage browser (source code on GitHub) allows you to browse the simulated storage.
  • SDK available in Java, C#, many others. Includes integration with Visual Studio (demonstrated).
  • Showing sample application. 4 lines of code from the SDK to move a folder to tape storage.
  • Q: Access times when not cached? Hours or minutes. Depends on if the tape is already in the drive. You can ask to pull those to cache, set priorities. By default GET has higher priority than PUT. 28TB or 56TB of cache.
  • Q: Can we use CIFS/NFS? Yes, there is an NFI (Network File Interface) using CIFS/NFS, which talks to the cache machine. Manages time-outs.
  • Q: Any protection against this being used as disk? System monitors health of the tape. Using an object-based interface helps.
  • Q: Can you stage a file for some time, like 24h? There is a large cache. But there are no guarantees on the latency. Keeping it on cache is more like Glacier. What’s the trigger to bring the data?
  • Q: Glacier? Considering support for it. Data policy to move to lower cost, move it back (takes time). Not a lot of product or customers demanding it. S3 has become the standard, not sure if Glacier will be that for archive.
  • Q: Drives are a precious resource. How do you handle overload? By default, reads have precedence over writes. Writes usually can wait.

 

Taxonomy of Differential Compression
Liwei Ren, Scientific Adviser, Trend Micro

  • Mathematical model for describing file differences
  • Lossless data compression categories: data compression (one file), differential compression (two files), data deduplication (multiple files)
  • Purposes: network data transfer acceleration and storage space reduction
  • Areas for DC – mobile phones’ firmware over the air, incremental update of files for security software, file synchronization and transfer over WAN, executable files
  • Math model – Diff procedure: Delta = T – R, Merge procedure: T = R + Delta. Model for reduced network bandwidth, reduced storage cost.
  • Applications: backup, revision control system, patch management, firmware over the air, malware signature update, file sync and transfer, distributed file system, cloud data migration
  • Diff model. Two operations: COPY (source address, size [, destination address] ), ADD (data block, size [, destination address] )
  • How to create the delta? How to encode the delta into a file? How to create the right sequence of COPY/ADD operations?
  • Top task is an effective algorithm to identify common blocks. Not covering it here, since it would take more than half an hour…
  • Modeling a diff package. Example.
  • How do you measure the efficiency of an algorithm? You need a cost model.
  • Categorizing: Local DC – LDC (xdelta, zdelta, bsdiff), Remote DC – RDC (rsync, RDC protocol, tsync), Iterative – IDC (proposed)
  • Categorizing: Not-in-place merging: general files (xdelta, zdelta, bsdiff), executable files (bsdiff, courgette)
  • Categorizing: In place merging: firmware as general files (FOTA), firmware as executable files (FOTA)
  • Topics in depth: LDC vs RDC vs IDC for general files
  • Topics in depth: LDC for executable files
  • Topics in depth: LDC for in-place merging

 

New Consistent Hashing Algorithms for Data Storage
Jason Resch, Software Architect, Cleversafe

  • Introducing a new algorithm for hashing.
  • Hashing is useful. Used commonly is distributed storage, distributed caching.
  • Independent users can coordinate (readers know where writers would write without talking to them).
  • Typically, resizing a Hash Table is inefficient. Showing example.
  • That’s why we need “Stable Hashing”. Showing example. Only a small portion of the keys need to be re-mapped.
  • Stable hashing becomes a necessity when the system is stateful and/or transferring state is expensive.
  • Used in Caching/Routing (CARP), DHT/Storage (Gluster, DynamoDB, Cassandra, ceph, openstack)
  • Stable Hashing with Global Namespaces. If you have a file name, you know what node has the data.
  • Eliminates points of contention, no metadata systems. Namespace is fixed, but the system is dynamic.
  • Balances read/write load across nodes, as well as storage utilization across nodes.
  • Perfectly Stable Hashing (Rendezvous Hashing, Consistent Hashing). Precisely weighted (CARP, RUSH, CRUSH).
  • It would be nice to have something that would offer the characteristics of both.
  • Consistent: buckets inserted in random positions. Keys map to the next node greater than that key. With a new node, only neighbors are disrupted. But the neighbor has to send data to the new node, and keys might not be distributed evenly.
  • Rendezvous: Score = Hash (Bucket ID || Key). Bucket with the highest score wins. When adding a new node, some of the keys will move to it. Every node is disrupted evenly.
  • CARP is rendezvous hashing with a twist. It multiplies the scores by a “Load Factor” for each node. Allows for some nodes being more capable than others. Not perfectly stable: if a node’s weighting changes or a node is added, then all load factors must be recomputed.
  • RUSH/CRUSH: Hierarchical tree, with each node assigned a probability to go left/right. CRUSH makes the tree match the fault domains of the system. Efficient to add nodes, but not to remove or re-weight nodes.
  • New algorithm: Weighted Rendezvous Hashing (WRH). Both perfectly stable and precisely weighted.
  • WRH adjusts scores before weighting them. Unlike CARP, scores aren’t relatively scaled.
  • No unnecessary transfer of keys when adding/removing nodes. If adding node or increasing weight on node, other nodes will move keys to it, but nothing else. Transfers are equalized and perfectly efficient.
  • WRH is simple to implement. Whole python code showed in one slide.
  • All the magic is in one line: “Score = 1.0 / -math.log(hash_f)” – Proof of correctness provided for the math inclined.
  • How Cleversafe uses WRH. System is grown by set of devices. Devices have a lifecycle: added, possibly expanded, then retired.
  • Detailed explanation of the lifecycle and how keys move as nodes are added, expanded, retired.
  • Storage Resource Map. Includes weight, hash_seed. Hash seed enables a clever trick to retire device sets more efficiently.
  • Q: How to find data when things are being moved? If clients talk to the old node while keys are being moved. Old node will proxy the request to the new node.

 

Storage Class Memory Support in the Windows Operating System
Neal Christiansen, Principal Development Lead, Microsoft

  • Windows support for non-volatile storage medium with RAM-like performance is a big change.
  • Storage Class Memory (SCM): NVDIMM, 3D XPoint, others
  • Microsoft involved with the standardization efforts in this space.
  • New driver model necessary: SCM Bus Driver, SCM Disk Driver.
  • Windows Goals for SCM: Support zero-copy access, run most user-mode apps unmodified, option for 100% backward compatibility (new types of failure modes), sector granular failure modes for app compat.
  • Applications make lots of assumptions on the underlying storage
  • SCM Storage Drivers will support BTT – Block Translation Table. Provides sector-level atomicity for writes.
  • SCM is disruptive. Fastest performance and application compatibility can be conflicting goals.
  • SCM-aware File Systems for Windows. Volume modes: block mode or DAS mode (chosen at format time).
  • Block Mode Volumes – maintain existing semantics, full application compatibility
  • DAS Mode Volumes – introduce new concepts (memory mapped files, maximizes performance). Some existing functionality is lost. Supported by NTFS and ReFS.
  • Memory Mapped IO in DAS mode. Application can create a memory mapped section. Allowed when volumes resides on SCM hardware and the volume has been formatted for DAS mode.
  • Memory Mapped IO: True zero copy access. BTT is not used. No paging reads or paging writes.
  • Cached IO in DAS Mode: Cache manager creates a DAS-enabled cache map. Cache manager will copy directly between user’s buffer and SCM. Coherent with memory-mapped IO. App will see new failure patterns on power loss or system crash. No paging reads or paging writes.
  • Non-cached IO in DAS Mode. Will send IO down the storage stack to the SCM driver. Will use BTT. Maintains existing storage semantics.
  • If you really want the performance, you will need to change your code.
  • DAS mode eliminates traditional hook points used by the file system to implement features.
  • Features not in DAS Mode: NTFS encryption, NTS compression, NTFS TxF, ReFS integrity streams, ReFS cluster band, ReFS block cloning, Bitlocker volume encryption, snapshot via VolSnap, mirrored or parity via storage spaces or dynamic disks
  • Sparse files won’t be there initially but will come in the future.
  • Updated at the time the file is memory mapped: file modification time, mark file as modified in the USN journal, directory change notification
  • File System Filters in DAS mode: no notification that a DAS volume is mounted, filter will indicate via a flag if they understand DAS mode semantics.
  • Application compatibility with filters in DAS mode: No opportunity for data transformation filters (encryption, compression). Anti-virus are minimally impacted, but will need to watch for creation of writeable mapped sections (no paging writes anymore).
  • Intel NVLM library. Open source library implemented by Intel. Defines set of application APIs for directly manipulating files on SCM hardware.
  • NVLM library available for Linux today via GitHub. Microsoft working with Intel on a Windows port.
  • Q: XIP (Execute in place)? It’s important, but the plans have not solidified yet.
  • Q: NUMA? Can be in NUMA nodes. Typically, the file system and cache are agnostic to NUMA.
  • Q: Hyper-V? Not ready to talk about what we are doing in that area.
  • Q: Roll-out plan? We have one, but not ready to talk about it yet.
  • Q: Data forensics? We’ve yet to discuss this with that group. But we will.
  • Q: How far are you to completion? It’s running and working today. But it is not complete.
  • Q: Windows client? To begin, we’re targeting the server. Because it’s available there first.
  • Q: Effect on performance? When we’re ready to announce the schedule, we will announce the performance. The data about SCM is out there. It’s fast!
  • Q: Will you backport? Probably not. We generally move forward only. Not many systems with this kind of hardware will run a down level OS.
  • Q: What languages for the Windows port of NVML? Andy will cover that in his talk tomorrow.
  • Q: How fast will memory mapped be? Potentially as fast as DRAM, but depends on the underlying technology.

 

The Bw-Tree Key-Value Store and Its Applications to Server/Cloud Data Management in Production
Sudipta Sengupta, Principal Research Scientist, Microsoft Research

  • The B-Tree: key-ordered access to records. Balanced tree via page split and merge mechanisms.
  • Design tenets: Lock free operation (high concurrency), log-structure storage (exploit flash devices with fast random reads and inefficient random writes), delta updates to pages (reduce cache invalidation, garbage creation)
  • Bw-Tree Architecture: 3 layers: B-Tree (expose API, B-tree search/update, in-memory pages), Cache (logical page abstraction, move between memory and flash), Flash (reads/writes from/to storage, storage management).
  • Mapping table: Expose logical pages to access method layer. Isolates updates to single page. Structure for lock-free multi-threaded concurrency control.
  • Highly concurrent page updates with Bw-Tree. Explaining the process using a diagram.
  • Bw-Tree Page Split: No hard threshold for splitting unlike in classical B-Tree. B-link structure allows “half-split” without locking.
  • Flash SSDs: Log-Structured storage. Use log structure to exploit the benefits of flash and work around its quirks: random reads are fast, random in-place writes are expensive.
  • LLAMA Log-Structured Store: Amortize cost of writes over many page updates. Random reads to fetch a “logical page”.
  • Depart from tradition: logical page formed by linking together records on multiple physical pages on flash. Adapted from SkimpyStash.
  • Detailed diagram comparing traditional page writing with the writing optimized storage organization with Bw-Tree.
  • LLAMA: Optimized Logical Page Reads. Multiple delta records are packed when flushed together. Pages consolidated periodically in memory also get consolidated on flash when flushed.
  • LLAMA: Garbage collection on flash. Two types of record units in the log: Valid or Orphaned. Garbage collection starts from the oldest portion of the log. Earliest written record on a logical page is encountered first.
  • LLAMA: cache layer. Responsible for moving pages back and forth from storage.
  • Bw-Tree Checkpointing: Need to flush to buffer and to storage. LLAMA checkpoint for fast recovery.
  • Bw-Tree Fast Recovery. Restore mapping table from latest checkpoint region. Warm-up using sequential I/O.
  • Bw-Tree: Support for transactions. Part of the Deuteronomy Architecture.
  • End-to-end crash recovery. Data component (DC) and transactional component (TC) recovery. DC happens before TC.
  • Bw-Tree in production: Key-sequential index in SQL Server in-memory database
  • Bw-Tree in production: Indexing engine in Azure DocumentDB. Resource governance is important (CPU, Memory, IOPS, Storage)
  • Bw-Tree in production: Sorted key-value store in Bing ObjectStore.
  • Summary: Classic B-Tree redesigned for modern hardware and cloud. Lock-free, delta updating of pages, log-structure, flexible resource governor, transactional. Shipping in production.
  • Going forward: Layer transactional component (Deuteronomy Architecture, CIDR 2015), open-source the codebase

 

ReFS v2: Cloning, Projecting, and Moving Data
J.R. Tipton, Development Lead, Microsoft

  • Agenda: ReFS v1 primer, ReFS v2 at a glance, motivations for ReFS v2, cloning, translation, transformation
  • ReFS v1 primer: Windows allocate-on-write file system, Merkel trees verify metadata integrity, online data correction from alternate copies, online chkdsk
  • ReFS v2: Available in Windows Server 2016 TP4. Efficient, reliable storage for VMs, efficient parity, write tiering, read caching, block cloning, optimizations
  • Motivations for ReFS v2: cheap storage does not mean slow, VM density, VM provisioning, more hardware flavors (SLC, MLC, TLC flash, SMR)
  • Write performance. Magic does not work in a few environments (super fast hardware, small random writes, durable writes/FUA/sync/write-through)
  • ReFS Block Cloning: Clone any block of one file into any other block in another file. Full file clone, reorder some or all data, project data from one area into another without copy
  • ReFS Block Cloning: Metadata only operation. Copy-on-write used when needed (ReFS knows when).
  • Cloning examples: deleting a Hyper-V VM checkpoint, VM provisioning from image.
  • Cloning observations: app directed, avoids data copies, metadata operations, Hyper-V is the first but not the only one using this
  • Cloning is no free lunch: multiple valid copies will copy-on-write upon changes. metadata overhead to track state, slam dunk in most cases, but not all
  • ReFS cluster bands. Volume internally divvied up into bands that contain regular FS clusters (4KB, 64KB). Mostly invisible outside file system. Bands and clusters track independently (per-band metadata). Bands can come and go.
  • ReFS can move bands around (read/write/update band pointer). Efficient write caching and parity. Writes to bands in fast tier. Tracks heat per band. Moves bands between tiers. More efficient allocation. You can move from 100% triple mirroring to 95% parity.
  • ReFS cluster bands: small writes accumulate where writing is cheap (mirror, flash, log-structured arena), bands are later shuffled to tier where random writes are expensive (band transfers are fully sequential).
  • ReFS cluster bands: transformation. ReFS can do stuff to the data in a band (can happen in the background). Examples: band compaction (put cold bands together, squeeze out free space), band compression (decompress on read).
  • ReFS v2 summary: data cloning, data movement, data transformation. Smart when smart makes sense, switches to dumb when dumb is better. Takes advantages of hardware combinations. And lots of other stuff…
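A rough Python sketch of the block-cloning behavior described above: files hold references to shared, reference-counted extents, a clone copies only the references (a metadata-only operation), and a later write breaks sharing with copy-on-write. It mirrors the idea in spirit only; it is not how ReFS stores its metadata.

extents = {}      # extent id -> [bytes, refcount]
next_id = 0

def new_extent(data):
    global next_id
    extents[next_id] = [data, 1]
    next_id += 1
    return next_id - 1

class File:
    def __init__(self, blocks):
        self.blocks = list(blocks)                 # list of extent ids

    def clone_into(self, other, src, dst, count):
        for i in range(count):                     # metadata only: bump refcounts, copy ids
            eid = self.blocks[src + i]
            extents[eid][1] += 1
            other.blocks[dst + i] = eid

    def write(self, idx, data):
        eid = self.blocks[idx]
        if extents[eid][1] > 1:                    # shared: copy-on-write into a new extent
            extents[eid][1] -= 1
            self.blocks[idx] = new_extent(data)
        else:                                      # unshared: update in place
            extents[eid][0] = data

vm = File([new_extent(b"image block %d" % i) for i in range(4)])
checkpoint = File([None] * 4)
vm.clone_into(checkpoint, 0, 0, 4)                 # instant "copy": no data is moved
vm.write(0, b"modified block 0")                   # only now is data duplicated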

 

Innovator, Disruptor or Laggard, Where Will Your Storage Applications Live? Next Generation Storage
Bev Crair, Vice President and General Manager, Storage Group, Intel

  • The world is changing: information growth,  complexity, cloud, technology.
  • Growth: 44ZB of data in all systems. 15% of the data is stored, since perceived cost is low.
  • Every minute of every day, 2013: 8 hours of video uploaded to YouTube, 47,000 apps downloaded, 200 million e-mails
  • Every minute of every day, 2015: 300 hours of video uploaded to YouTube, 51,000 apps downloaded, 204 million e-mails
  • Data never sleeps: the internet in real time. tiles showing activities all around the internet.
  • Data use pattern changes: sense and generate, collect and communicate, analyze and optimize. Example: the Large Hadron Collider
  • Data use pattern changes: from collection to analyzing data, valuable data now reside outside the organization, analyzing and optimizing unstructured data
  • Cloud impact on storage solutions: business impact, technology impact. Everyone wants an easy button
  • Intelligent storage: Deduplication, real-time compression, intelligent tiering, thin provisioning. All of this is a software problem.
  • Scale-out storage: From single system with internal network to nodes working together with an external network
  • Non-Volatile Memory (NVM) accelerates the enterprise: Examples in Virtualization, Private Cloud, Database, Big Data and HPC
  • Pyramid: CPU, DRAM, Intel DIMM (3D XPoint), Intel SSD (3D XPoint), NAND SSD, HDD,  …
  • Storage Media latency going down dramatically. With NVM, the bottleneck is now mostly in the software stack.
  • Future storage architecture: complex chart with workloads for 2020 and beyond. New protocols, new ways to attach.
  • Intel Storage Technologies. Not only hardware, but a fair amount of software. SPDK, NVMe driver, Acceleration Library, Lustre, others.
  • Why does faster storage matter? Genome testing for cancer takes weeks, and the cancer mutates. Genome is 10TB. If we can speed up the time it takes to test it to one day, it makes a huge difference and you can create a medicine that saves a person’s life. That’s why it matters.

  

The Long-Term Future of Solid State Storage
Jim Handy, General Director, Objective Analysis

  • How we got here? Why are we in the trouble we’re at right now? How do we get ahead of it? Where is it going tomorrow?
  • Establishing a schism: Memory is in bytes (DRAM, Cache, Flash?), Storage is in blocks (Disk, Tape, DVD, SAN, NAS, Cloud, Flash)
  • Is it really about block? Block, NAND page, DRAM pages, CPU cache lines. It’s all in pages anyway…
  • Is there another differentiator? Volatile vs. Persistent. It’s confusing…
  • What is an SSD? SSDs are nothing new. Going back to DEC Bulk Core.
  • Disk interfaces create delays. SSD vs HDD latency chart. Time scale in milliseconds.
  • Zooming in to tens of microseconds. Different components of the SSD delay. Read time, Transfer time, Link transfer, platform and adapter, software
  • Now looking at delays for MLC NAND ONFi2, ONFi3, PCIe x4 Gen3, future NVM on PCIe x4 Gen3
  • Changing the scale to tens of microseconds on future NVM. Link transfer, platform & adapter and software now account for most of the latency (illustrated with rough numbers in the sketch after this list).
  • How to move ahead? Get rid of the disk interfaces (PCIe, NVMe, new technologies). Work on the software: SNIA.
  • Why now? DRAM transfer rates. Chart of transfer rates for SDRAM, DDR, DDR2, DDR3, DDR4. Designing the bus takes most of the time.
  • DRAM running out of speed? We probably won’t see a DDR5. HMC or HBM a likely next step. Everything points to fixed memory sizes.
  • NVM to the rescue. DRAM is not the only upgrade path. It became cheaper to use NAND flash than DRAM to upgrade a PC.
  • NVM to be a new memory layer between DRAM & NAND: Intel/Micron 3D XPoint – “Optane”
  • One won’t kill the other. Future systems will have DRAM, NVM, NAND, HDD. None of them will go away…
  • New memories are faster than NAND. Chart with read bandwidth vs write bandwidth. Emerging NVRAM: FeRAM, eMRAM, RRAM, PRAM.
  • Complex chart with emerging research memories. Clock frequency vs. Cell Area (cost).
  • The computer of tomorrow. Memory or storage? In the beginning (core memory), there was no distinction between the two.
  • We’re moving to an era where you can turn off the computer, turn it back on and there’s something in memory. Do you trust it?
  • SCM – Storage Class Memory: high performance with archival properties. There are many other terms for it: Persistent Memory, Non-Volatile Memory.
  • New NVM has disruptively low latency: Log chart with latency budgets for HDD, SATA SSD, NVMe, Persistent. When you go below 10 microseconds (as Persistent does), context switching does not make sense.
  • Non-blocking I/O. NUMA latencies up to 200ns have been tolerated. Latencies beyond these cause disruption.
  • Memory mapped files eliminate file system latency.
  • The computer of tomorrow. Fixed DRAM size, upgradeable NVM (tomorrow’s DIMM), both flash and disk (flash on PCIe or own bus), much work needed on SCM software
  • Q: Will all these layers survive? I believe so. There are potential improvements in all of them (cited a few on NAND, HDD).
  • Q: Shouldn’t we drop one of the layers? Usually, adding layers (not removing them) is more interesting from a cost perspective.
  • Q: Do we need a new protocol for SCM? NAND did well without much of that. Alternative memories could be put on a memory bus.

 

Concepts on Moving From SAS connected JBOD to an Ethernet Connected JBOD
Jim Pinkerton, Partner Architect Lead, Microsoft

  • What if we took a JBOD, a simple device, and just put it on Ethernet?
  • Re-Thinking the Software-defined Storage conceptual model definition: compute nodes, storage nodes, flakey storage devices
  • Front-end fabric (Ethernet, IB or FC), Back-end fabric (directly attached or shared storage)
  • Yesterday’s Storage Architecture: Still highly profitable. Compute nodes, traditional SAN/NAS box (shipped as an appliance)
  • Today: Software Defined Storage (SDS) – “Converged”. Separate the storage service from the JBOD.
  • Today: Software Defined Storage (SDS) – “Hyper-Converged” (H-C). Everything ships in a single box. Scale-out architecture.
  • H-C appliances are a dream for the customer to install/use, but the $/GB storage is high.
  • Microsoft Cloud Platform System (CPS). Shipped as a packaged deal. Microsoft tested and guaranteed.
  • SDS with DAS – Storage layer divided into storage front-end (FE) and storage back-end (BE). The two communicate over Ethernet.
  • SDS Topologies. Going from Converged and Hyper-Converged to a future EBOD topology. From file/block access to device access.
  • Expose the raw device over Ethernet. The raw device is flaky, but we love it. The storage FE will abstract that, add reliability. (A toy block-service sketch follows this list.)
  • I would like to have an EBOD box that could provide the storage BE.
  • EBOD works for a variety of access protocols and topologies. Examples: SMB3 “block”, Lustre object store, Ceph object store, NVMe fabric, T10 objects.
  • Shared SAS Interop. Nightmare experience (disk multi-path interop, expander multi-path interop, HBA distributed failure). This is why customers prefer appliances.
  • To share or not to share. We want to share, but we do not want shared SAS. Customer deployment is more straightforward, but you have more traffic on Ethernet.
  • Hyper-Scale cloud tension – fault domain rebuild time. Depends on number of disks behind a node and how much network you have.
  • Fault domain for storage is too big. Required network speed offsets cost benefits of greater density. Many large disks behind a single node becomes a problem.
  • Private cloud tension – not enough disks. Entry points at 4 nodes, small number of disks. Again, fault domain is too large.
  • Goals in refactoring SDS – Storage back-end is a “data mover” (EBOD). Storage front-end is “general purpose CPU”.
  • EBOD goals – Can you hit a cost point that’s interesting? Reduce storage costs, reduce size of fault domain, build a more robust ecosystem of DAS. Keep topology simple, so customer can build it themselves.
  • EBOD: High end box, volume box, capacity box.
  • EBOD volume box should be close to what a JBOD costs. Basically like exposing raw disks.
  • Comparing current Hyper-Scale to EBOD. EBOD has an NIC and an SOC, in addition to the traditional expander in a JBOD.
  • EBOD volume box – Small CPU and memory, dual 10GbE, SOC with RDMA NIC/SATA/SAS/PCIe, up to 20 devices, SFF-8639 connector, management (IPMI, DMTF Redfish?)
  • Volume EBOD Proof Point – Intel Avoton, PCIe Gen 2, Chelsio 10GbE, SAS HBA, SAS SSD. Looking at random read IOPS (local, RDMA remote and non-RDMA remote). Max 159K IOPS w/RDMA, 122K IOPS w/o RDMA. Latency chart showing just a few msec.
  • EBOD Performance Concept – Big CPU, Dual attach 40GbE, Possibly all NVME attach or SCM. Will show some of the results this afternoon.
  • EBOD is an interesting approach that’s different from what we’re doing. But it’s nicely aligned with software-defined storage.
  • Price point of EBOD must be carefully managed, but the low price point enables a smaller fault domain.
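As a thought experiment for the "raw device over Ethernet" idea, here is a deliberately naive Python sketch of a block service: a TCP server that reads and writes byte ranges of a backing file on behalf of a remote storage front-end. The one-byte-opcode protocol, port number and backing file are invented for illustration; the talk's real candidates are protocols such as SMB3 "block" and NVMe over Fabrics, typically over RDMA.

import socketserver, struct

BACKING_FILE = "disk.img"   # assumption: a pre-created file standing in for the raw device

class BlockHandler(socketserver.StreamRequestHandler):
    def handle(self):
        while True:
            header = self.rfile.read(17)               # op (1 byte) + offset (8) + length (8)
            if len(header) < 17:
                break
            op, offset, length = struct.unpack("!cQQ", header)
            with open(BACKING_FILE, "r+b") as dev:
                dev.seek(offset)
                if op == b"R":
                    self.wfile.write(dev.read(length))
                elif op == b"W":
                    dev.write(self.rfile.read(length))

if __name__ == "__main__":
    socketserver.TCPServer(("0.0.0.0", 9900), BlockHandler).serve_forever()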

  

Planning for the Next Decade of NVM Programming
Andy Rudoff, SNIA NVM Programming TWG, Intel

  • Looking at what’s coming up in the next decade, but will start with some history.
  • Comparison of data storage technologies. Emerging NV technologies with read times in the same order of magnitude as DRAM.
  • Moving the focus to software latency when using future NVM.
  • Is it memory or storage? It’s persistent (like storage) and byte-addressable (like memory).
  • Storage vs persistent memory. Block IO vs. byte addressable, sync/async (DMA master)  vs. sync (DMA slave). High capacity vs. growing capacity.
  • pmem: The new Tier. Byte addressable, but persistent. Not NAND. Can do small I/O. Can DMA to it.
  • SNIA TWG (lots of companies). Defining the NVM programming model: NVM.PM.FILE mode and NVM.PM.VOLUME mode.
  • All the OSes created in the last 30 years have memory-mapped files. (A minimal mmap sketch follows this list.)
  • Is this stuff real? Why are we spending so much time on this? Yes – Intel 3D XPoint technology, the Intel DIMM. Showed a wafer on stage. 1000x faster than NAND. 1000X endurance of NAND, 10X denser than conventional memory. As much as 6TB of this stuff…
  • Timeline: Big gap between NAND flash memory (1989) and 3D XPoint (2015).
  • Diagram of the model with Management, Block, File and Memory access. Link at the end to the diagram.
  • Detecting pmem: Defined in the ACPI 6.0. Linux support upstream (generic DIMM driver, DAX, ext4+DAX, KVM).  Neal talked about Windows support yesterday.
  • Heavy OSV involvement in TWG, we wrote the spec together.
  • We don’t want every application to have to re-architect itself. That’s why we have block and file there as well.
  • The next decade
  • Transparency levels: increasing barrier to adoption, increasing leverage. Could do it in layers. For instance, could be file system only, without app modification, or could modify just the JVM to get significant advantages without changing the apps.
  • Comparing to multiple cores in hardware and multi-threaded programming. Took a decade or longer, but it’s commonplace now.
  • One transparent example: pmem Paging. Paging from the OS page cache (diagrams).
  • Attributes of paging: major page faults, memory looks much larger, page-in must pick a victim, many enterprise apps opt out, interesting example: Java GC.
  • What would it look like if you paged to pmem instead of paging to storage. I don’t even care that it’s persistent, just that there’s a lot of it.
  • I could kick a page out synchronously, probably faster than a context switch. But the app could access the data in pmem without swapping it in (that's new!). Could have policies for which app lives in which memory. The OS could manage that, with application transparency.
  • Would this really work? It will when pmem costs less, performance is close, capacity is significant and it is reliable. “We’re going to need a bigger byte” to hold error information.
  • Not just for pmem. Other memory technologies are emerging. High bandwidth memory, NUMA localities, different NVM technologies.
  • Extending into user space: NVM Library – pmem.io (64-bit Linux Alpha release). Windows is working on it as well.
  • That is a non-transparent example. It’s hard (like multi-threading). Things can fail in interesting new ways.
  • The library makes it easier and some of it is transactional.
  • No kernel interception point, for things like replication. No chance to hook above or below the file system. You could do it in the library.
  • Non-transparent use cases: volatile caching, in-memory database, storage appliance write cache, large byte-addressable data structures (hash table, dedup), HPC (checkpointing)
  • Sweet spots: middleware, libraries, in-kernel usages.
  • Big challenge: middleware, libraries. Is it worth the complexity?
  • Building a software ecosystem for pmem, cost vs. benefit challenge.
  • Prepare yourself: lean NVM programming model, map use cases to pmem, contribute to the libraries, software ecosystem
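A minimal sketch of the NVM.PM.FILE style of access using an ordinary memory-mapped file in Python: map it, store into it with plain byte assignments, then flush. On a DAX-capable file system over real pmem the stores reach persistent media directly; on a regular disk this is just a normal mmap, so treat it as an illustration of the programming model only. The file name is a made-up stand-in for a pmem pool.

import mmap

path = "pmem_pool.bin"                     # assumption: stand-in for a pmem-backed file
size = 4096

with open(path, "w+b") as f:
    f.truncate(size)                       # size the pool
    with mmap.mmap(f.fileno(), size) as m: # byte-addressable window onto the file
        m[0:11] = b"hello, pmem"           # update with ordinary stores, no read()/write() calls
        m.flush()                          # flush dirty pages (msync; cache flush + fence on real pmem)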

 

FS Design Around SMR: Seagate’s Journey and Reference System with EXT4
Adrian Palmer, Drive Development Engineering, Seagate Technologies

  • SNIA Tutorial. I’m talking about the standard, as opposed to the design of our drive.
  • SMR is being embraced by everyone, since this is a major change, a game changer.
  • The write profile moves from random writes to resemble that of sequential-access tape.
  • One new condition: forward-write preferred. ZAC/ZBD specs (T10/T13): zones, SCSI ZBC standard, ATA ZAC standard.
  • What is a file system? Essential software on a system, structured and unstructured data, stores metadata and data.
  • Basic FS requirements: Write-in-place (superblock, known location on disk), Sequential write (journal), Unrestricted write type (random or sequential)
  • Drive parameters: Sector (atomic unit of read/write access). Typically 512B size. Independently accessed. Read/write, no state.
  • Drive parameters: Zone (atomic performant rewrite unit). Typically 256 MiB in size. Indirectly addressed via sector. Modified with ZAC/ZBD commands. Each zone has state (WritePointer, Condition, Size, Type).
  • Write Profiles. Conventional (random access), Tape (sequential access), Flash (sequential access, erase blocks), SMR HA/HM (sequential access, zones). SMR write profile is similar to Tape and Flash.
  • Allocation containers. Drive capacities are increasing, location mapping is expensive. 1.56% with 512B blocks or 0.2% with 4KB blocks.
  • Remap the block device as a… block device. Partitions (w*sector size), Block size (x*sector size), Group size (y*Block size), FS (z*group size, expressed as blocks).
  • Zones are a good fit to be matched with Groups. Absorb and mirror the metadata, don’t keep querying drive for metadata.
  • Solving the sequential write problem. Separate the problem spaces with zones.
  • Dedicate zones to each problem space: user data, file records, indexes, superblock, trees, journal, allocation containers.
  • GPT/Superblocks: First and last zone (convention, not guaranteed). Update infrequently, and at dismount. Looks at known location and WritePointer. Copy-on-update. Organized wipe and update algorithm.
  • Journal/soft updates. Update very frequently, 2 or more zones, set up as a circular buffer. Checkpoint at each zone. Wipe and overwrite oldest zone. Can be used as NV cache for metadata. Requires lots of storage space for efficient use and NV.
  • Group descriptors: Infrequently changed. Changes on zone condition change, resize, free block counts. Write cached, but written at WritePointer. Organized as a B+Tree, not an indexed array. The B+Tree needs to be stored on-disk.
  • File Records: POSIX information (ctime, mtime, atime, msize, fs specific attributes), updated very frequently. Allows records to be modified in memory, written to journal cache, gather from journal, write to new blocks at WritePointer.
  • Mapping (file records to blocks). File ideally written as a single chunk (single pointer), but could become fragmented (multiple pointers). Can outgrow file record space, needs its own B+Tree. List can be in memory, in the journal, written out to disk at WritePointer.
  • Data: Copy-on-write. Allocator chooses blocks at WritePointer. Writes are broken at zone boundary, creating new command and new mapping fragment. (See the toy allocator after this list.)
  • Cleanup: Cannot clean up as you go, need a separate step. Each zone will have holes. Garbage collection: Journal GC, Zones GC, Zone Compaction, Defragmentation.
  • Advanced features: indexes, queries, extended attributes, snapshots, checksums/parity, RAID/JBOD.
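A toy Python model of the zone/WritePointer behavior described above, with made-up sizes: each zone only accepts appends at its write pointer, and a write that crosses a zone boundary is split into separate extents, which is why a file's mapping can fragment.

ZONE_SIZE = 8          # blocks per zone (real zones are around 256 MiB)

class Zone:
    def __init__(self):
        self.write_pointer = 0
        self.blocks = []

    def append(self, blocks):
        assert self.write_pointer + len(blocks) <= ZONE_SIZE, "forward-write only"
        self.blocks.extend(blocks)
        self.write_pointer += len(blocks)

def allocate_and_write(zones, current, data):
    """Write data sequentially, breaking the write at zone boundaries."""
    extents = []                                   # (zone index, start, length)
    while data:
        room = ZONE_SIZE - zones[current].write_pointer
        if room == 0:
            current += 1                           # zone full: move to the next zone
            continue
        chunk, data = data[:room], data[room:]
        start = zones[current].write_pointer
        zones[current].append(chunk)
        extents.append((current, start, len(chunk)))
    return current, extents

zones = [Zone() for _ in range(4)]
current, extents = allocate_and_write(zones, 0, list(range(20)))
print(extents)         # [(0, 0, 8), (1, 0, 8), (2, 0, 4)] -> one file, three extents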

 

Azure File Service: ‘Net Use’ the Cloud
David Goebel, Software Engineer, Microsoft

  • Agenda: features and API (what), scenarios enabled (why), design of an SMB server not backed by a conventional FS (how)
  • It’s not the Windows SMB server (srv2.sys). Uses Azure Tables and Azure Blobs for the actual files.
  • Easier because we already have a highly available and distributed architecture.
  • SMB 2.1 in preview since last summer. SMB 3.0 (encryption, persistent handles) in progress.
  • Azure containers mapped as shares. Clients work unmodified out-of-the-box. We implemented the spec.
  • Share namespace is coherently accessible
  • MS-SMB2, not SMB1. Anticipates (but does not require) a traditional file system on the other side.
  • In some ways it’s harder, since what’s there is not a file system. We have multiple tables (for leases, locks, etc). Nice and clean.
  • SMB is a stateful protocol, while REST is all stateless. Some state is immutable (like FileId), some state is transient (like open counts), some is maintained by the client (like CreateGuid), some state is ephemeral (connection).
  • Diagram with the big picture. Includes DNS, load balancer, session setup & traffic, front-end node, azure tables and blobs.
  • Front-end has ephemeral and immutable state. Back-end has solid and fluid durable state.
  • Diagram with two clients accessing the same file and share, using locks, etc. All the state handled by the back-end.
  • Losing a front-end node is considered a regular event (happens during updates); the client simply reconnects, transparently.
  • Current state, SMB 2.1 (SMB 3.0 in the works). 5TB per share and 1TB per file. 1,000 8KB IOPS per share, 60MB/sec per share. Some NTFS features not supported, some limitations on characters and path length (due to HTTP/REST restrictions).
  • Demo: I’m actually running my talk using a PPTX file on Azure File. Robocopy to file share. Delete, watch via explorer (notifications working fine). Watching also via wireshark.
  • Current Linux support: lists specific versions of Ubuntu Server, Ubuntu Core, CentOS, openSUSE, SUSE Linux Enterprise Server.
  • Why: They want to move to cloud, but they can’t change their apps. Existing file I/O applications. Most of what was written over the last 30 years “just works”. Minor caveats that will become more minor over time.
  • Discussed specific details about how permissions are currently implemented. ACL support is coming.
  • Example: Encryption enabled scenario over the internet.
  • What about REST? SMB and REST access the same data in the same namespace, so a gradual application transition without disruption is possible. REST for container, directory and file operations.
  • The durability game. Modified state that normally exists only in server memory, which must be durably committed.
  • Examples of state tiering: ephemeral state, immutable state, solid durable state, fluid durable state.
  • Example: Durable Handle Reconnect. Intended for network hiccups, but stretched to also handle front-end reconnects. Limited our ability because of SMB 2.1 protocol compliance.
  • Example: Persistent Handles. Unlike durable handles, SMB 3 is actually intended to support transparent failover when a front-end dies. Seamless transparent failover.
  • Resource Links: Getting started blog (http://blogs.msdn.com/b/windowsazurestorage/archive/2014/05/12/introducing-microsoft-azure-file-service.aspx) , NTFS features currently not supported (https://msdn.microsoft.com/en-us/library/azure/dn744326.aspx), naming restrictions for REST compatibility (https://msdn.microsoft.com/library/azure/dn167011.aspx).

 

Software Defined Storage – What Does it Look Like in 3 Years?
Richard McDougall, Big Data and Storage Chief Scientist, VMware

  • How do you come up with a common, generic storage platform that serves the needs of applications?
  • Bringing a definition of SDS. Major trends in hardware, what the apps are doing, cloud platforms
  • Storage workloads map. Many apps on 4 quadrants on 2 axis: capacity (10’s of Terabytes to 10’s of Petabytes) and IOPS (1K to 1M)
  • What are cloud-native applications? Developer access via API, continuous integration and deployment, built for scale, availability architected in the app, microservices instead of monolithic stacks, decoupled from infrastructure
  • What do Linux containers need from storage? Copy/clone root images, isolated namespace, QoS controls
  • Options to deliver storage to containers: copy whole root tree (primitive), fast clone using shared read-only images, clone via “Another Union File System” (aufs), leverage native copy-on-write file system.
  • Shared data: Containers can share a file system within a host or across hosts (new interest in distributed file systems)
  • Docker storage abstractions for containers: non-persistent boot environment, persistent data (backed by block volumes)
  • Container storage use cases: unshared volumes, shared volumes, persist to external storage (API to cloud storage)
  • Eliminate the silos: converged big data platform. Diagram shows Hadoop, HBase, Impala, Pivotal HawQ, Cassandra, Mongo, many others. HDFS, MAPR, GPFS, POSIX, block storage. Storage system common across all these, with the right access mechanism.
  • Back to the quadrants based on capacity and IOPS. Now with hardware solutions instead of software. Many flash appliances in the upper left (low capacity, high IOPS). Isilon in the lower right (high capacity, low IOPS).
  • Storage media technologies in 2016. Pyramid with latency, capacity per device, capacity per host for each layer: DRAM (1TB/device, 4TB/host, ~100ns latency), NVM (1TB, 4TB, ~500ns), NVMe SSD (4TB, 48TB, ~10us), capacity SSD (16TB, 192TB, ~1ms), magnetic storage (32TB, 384TB, ~10ms), object storage (?, ?, ~1s). 
  • Back to the quadrants based on capacity and IOPS. Now with storage media technologies.
  • Details on the types of NVDIMM (NVDIMM-N – Type 1, NVDIMM-F – Type 2, Type 4). Standards coming up for all of these. Needs work to virtualize those, so they show up properly inside VMs.
  • Intel 3D XPoint Technology.
  • What are the SDS solutions that can sit on top of all this? Back to the quadrants with SDS solutions: Nexenta, ScaleIO, VSAN, Ceph, Scality, MapR, HDFS. Can you make one solution that works well for everything?
  • What’s really behind a storage array? The value from the customer is that it’s all from one vendor and it all works. Nothing magic, but the vendor spent a ton of time on testing.
  • Types of SDS: Fail-over software on commodity servers (lists many vendors), complexity in hardware, interconnects. Issues with hardware compatibility.
  • Types of SDS: Software replication using servers + local disks. Simpler, but not very scalable.
  • Types of SDS: Caching hot core/cold edge. NVMe flash devices up front, something slower behind it (even cloud). Several solutions, mostly startups.
  • Types of SDS: Scale-out SDS. Scalable, fault-tolerant, rolling updates. More management, separate compute and storage silos. Model used by ceph, ScaleiO. Issues with hardware compatibility. You really need to test the hardware.
  • Types of SDS: Hyper-converged SDS. Easy management, scalable, fault-tolerant, rolling upgrades. Fixed compute-to-storage ratio. Model used by VSAN, Nutanix. Amount of variance in hardware still a problem. Need to invest in HCL verification.
  • Storage interconnects. Lots of discussion on what’s the right direction. Protocols (iSCSI, FC, FCoE, NVMe, NVMe over Fabrics), Hardware transports (FC, Ethernet, IB, SAS), Device connectivity (SATA, SAS, NVMe)
  • Network. iSCSI, iSER, FCoE, RDMA over Ethernet, NVMe Fabrics. Can storage use the network? RDMA debate for years. We’re at a tipping point.
  • Device interconnects: HCA with SATA/SAS. NVMe SSD, NVM over PCIe. Comparing iSCSI, FCoE and NVMe over Ethernet.
  • PCIe rack-level Fabric. Devices become addressable. PCIe rack-scale compute and storage, with host-to-host RDMA.
  • NVMe – The new kid on the block. Support from various vendors. Quickly becoming the all-purpose stack for storage, becoming the universal standard for talking block.
  • Beyond block: SDS Service Platforms. Back to the 4 quadrants, now with service platforms.
  • Too many silos: block, object, database, key-value, big data. Each one is its own silo with its own machines, management stack, HCLs. No sharing of infrastructure.
  • Option 1: Multi-purpose stack. Has everything we talked about, but it’s a compromise.
  • Option 2: Common platform + ecosystem of services. Richest, best-of-breed services, on a single platform, manageable, shared resources.

 

Why the Storage You Have is Not the Storage Your Data Needs
Laz Vekiarides, CTO and Co-founder, ClearSky Data

  • ClearSky Data is a tech company that consumes what we discussed in this conference.
  • The problem we’re trying to solve is the management of the storage silos
  • Enterprise storage today. Chart: Capacity vs. $/TB. Flash, Mid-Range, Scale-Out. Complex, costly silos
  • Describe the lifecycle of the data, the many copies you make over time, the rebuilding and re-buying of infrastructure
  • What enterprises want: buy just enough of the infrastructure, with enough performance, availability, security.
  • Cloud economics – pay only for the stuff that you use, you don’t have to see all the gear behind the storage, someone does the physical management
  • Tiering is a bad answer – Nothing remains static. How fast does hot data cool? How fast does it re-warm? What is the overhead to manage it? It’s a huge overhead. It’s not just a bandwidth problem.
  • It’s the latency, stupid. Data travels at the speed of light. Fast, but finite. Boston to San Francisco: 29.4 milliseconds of round-trip time (best case). Reality (with switches, routers, protocols, virtualization) is more like 70 ms.
  • So, where exactly is the cloud? Amazon East is near Ashburn, VA. Best case is 10ms RTT. Worst case is ~150ms (does not include time to actually access the storage).
  • ClearSky solution: a global storage network. The infrastructure becomes invisible to you, what you see is a service level agreement.
  • Solution: Geo-distributed data caching. Customer SAN, Edge, Metro POP, Cloud. Cache on the edge (all flash), cache on the metro POP.
  • Edge to Metro POP are private lines (sub millisecond latency). Addressable market is the set of customers within a certain distance to the Metro POP.
  • Latency math: Less than 1ms to the Metro POP, cache miss path is between 25ms and 50ms.
  • Space Management: Edge (hot, 10%, 1 copy), POP (warm, <30%, 1-2 copies), Cloud (100%, n copies). All data is deduplicated and encrypted.
  • Modeling cache performance: Miss ratio curve (MRC). Performance as f(size), working set knees, inform allocation policy.
  • Reuse distance (unique intervening blocks between use and reuse). LRU is most of what’s out there. Look at stacking algorithms (a brute-force sketch follows this list). Chart on cache size vs. miss ratio. There’s a talk on this tomorrow by CloudPhysics.
  • Worked with customers to create a heat map data collector. Sizing tool for VM environments. Collected 3-9 days of workload.
  • ~1,400 virtual disks, ~800 VMs, 18.9TB (68% full), avg read IOPS 5.2K, write IOPS 5.9K. Read IO 36KB, write IO 110KB. Read Latency 9.7ms, write latency 4.5ms.
  • This is average latency; maximum is interesting, some are off the chart. Some were hundreds of ms, even 2 seconds.
  • Computing the cache miss ratio. How much cache would we need to get about 90% hit ratio? Could do it with less than 12% of the total.
  • What is cache hit for writes? What fits in the write-back cache. You don’t want to be synchronous with the cloud. You’ll go bankrupt that way.
  • Importance of the warm tier. Hot data (Edge, on prem, SSD) = 12%, warm data (Metro PoP, SSD and HDD) = 6%, cold data (Cloud) = 82%. Shown as a “donut”.
  • Yes, this works! We’re having a very successful outcome with the customers currently engaged.
  • Data access is very tiered. Small amounts of flash can yield disproportionate performance benefits. Single tier cache in front of high latency storage can’t work. Network latency is as important as bounding media latency.
  • Make sure your caching is simple. Sometimes you are overthinking it.
  • Identifying application patterns is hard. Try to identify the sets of LBA that are accessed. Identify hot spots, which change over time. The shape of the miss ratio remains similar.
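A brute-force Python sketch of the miss-ratio-curve analysis mentioned above: compute the LRU reuse distance of every access (the Mattson stack algorithm) and derive the miss ratio for any cache size from the distance distribution. Real tools use cleverer data structures and sampling; the trace below is made up.

def miss_ratio_curve(trace, cache_sizes):
    stack = []                                   # LRU stack, most recent block at the end
    distances = []                               # reuse distance per access (None = cold miss)
    for block in trace:
        if block in stack:
            pos = stack.index(block)
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(None)
        stack.append(block)
    return {size: sum(1 for d in distances if d is None or d >= size) / len(trace)
            for size in cache_sizes}

trace = [1, 2, 3, 1, 2, 3, 4, 1, 2, 5, 1, 2]     # hypothetical block access trace
print(miss_ratio_curve(trace, [1, 2, 3, 4]))     # miss ratio falls as the cache grows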

 

Emerging Trends in Software Development
Donnie Berkholz, Research Director, 451 Research

  • How people are building applications. How storage developers are creating and shipping software.
  • Technology adoption is increasingly bottom-up. Open source, cloud. Used to be like building a cathedral, now it’s more like a bazaar.
  • App-dev workloads are quickly moving to the cloud. Chart from all-on-prem at the top to all-cloud at the bottom.
  • All on-prem going from 59% now to 37% in a few years. Moving to different types of clouds (private cloud, Public cloud (IaaS), Public cloud (SaaS).
  • Showing charts for total data at organization, how much in off-premises cloud (TB and %). 64% of people have less than 20% on the cloud.
  • The new stack. There’s a lot of fragmentation. 10 languages in the top 80%. Used to be only 3 languages. Same thing for databases. It’s more composable, right tool for the right job.
  • No single stack. An infinite set of possibilities.
  • Growth in Web APIs charted since 2005 (from ProgrammableWeb). Huge growth.
  • What do enterprises think of storage vendors? Top vendors. People not particularly happy with their storage vendors. Promise index vs. fulfillment index.
  • Development trends that will transform storage.
  • Containers. Docker, docker, docker. Whale logos everywhere. When does it really make sense to use VMs or containers? You need lots of random I/O for these to work well. 10,000 containers in a cluster? Where do the databases go?
  • Developers love Docker. Chart on configuration management GitHub totals (CFEngine, Puppet, Chef, Ansible, Salt, Docker). Shows developer adoption. Docker is off the charts.
  • It’s not just a toy. Survey of 1,000 people on containers. Docker is only 2.5 years old now. 20% no plans, 56% evaluating. Those doing pilot or more add up to 21%. That’s really fast adoption.
  • Docker to microservices.
  • Amazon: “Every single data transfer between teams has to happen through an API or you’re fired”. Avoid sending spreadsheets around.
  • Microservices thinking is more business-oriented, as opposed to technology-oriented.
  • Loosely couple teams. Team organization has a great influence in your development.
  • The foundation of microservices. Terraform, MANTL, Apache Mesos, Capgemini Apollo, Amazon EC2 Container Service.
  • It’s a lot about scheduling. Number of schedulers that use available resources. Makes storage even more random.
  • Disruption in data processing. Spark. It’s a competitor to Hadoop, really good at caching in memory, also very fast on disk. 10x faster than map-reduce. People don’t have to be big data experts. Chart: Spark came out of nowhere (mining data from several public forums).
  • The market is coming. Hadoop market as a whole growing 46% (CAGR).
  • Storage-class memory. Picture of 3D XPoint. Do app developer care? Not sure. Not many optimize for cache lines in memory. Thinking about Redis in-memory database for caching. Developers probably will use SCM that way. Caching in the order of TB instead of GB.
  • Network will be incredibly important. Moving bottlenecks around.
  • Concurrency for developers. Chart of years vs. percentage on Ohloh. Getting near 1%. That’s a lot, since the most popular is around 10%.
  • Development trends
  • DevOps. Taking agile development all the way to production. Agile, truly tip to tail. You want to iterate while involving your customers. Already happening with startups, but how do you scale?
  • DevOps: Culture, Automation (Pets vs. Cattle), Measurement
  • Automation: infrastructure as code. Continuous delivery.
  • Measurement: Nagios, graphite, Graylog2, splunk, Kibana, Sensu, etsy/statsd
  • DevOps is reaching DBAs. #1 stakeholder in recent survey.
  • One of the most popular team structure changes: dispersing the storage team.
  • The changing role of standards
  • The changing role of benchmarks. Torturing databases for fun and profit.
  • I would love for you to join our panel. If you fill our surveys, you get a lot of data for free.

 

Learnings from Nearly a Decade of Building Low-cost Cloud Storage
Gleb Budman, CEO, Backblaze

  • What we learned, specifically the cost equation
  • 150+ PB of customer data. 10B files.
  • In 2007 we wanted to build something that would backup your PC/Mac data to the cloud. $5/month.
  • Originally we wanted to put it all on S3, but we would lose money on every single customer.
  • Next we wanted to buy SANs to put the data on, but that did not make sense either.
  • We tried a whole bunch of things. NAS, USB-connected drives, etc.
  • Cloud storage has a new player, with a shockingly low price: B2. One fourth of the cost of S3.
  • Lower than Glacier, Nearline, S3-Infrequent Access, anything out there. Savings here add up.
  • Datacenter: convert kilowatts-to-kilobits
  • Datacenter considerations: local cost of power, real estate, taxes, climate, building/system efficiency, proximity to good people, connectivity.
  • Hardware: Connect hard drives to the internet, with as little as possible in between.
  • Backblaze storage box, costs about $3K. As simple as possible, don’t make the hardware itself redundant. Use commodity parts (example: desktop power supply), use consumer hard drives, insource & use math for drive purchases.
  • They told us we could not use consumer hard drives. But the reality is that the failure rate was actually lower. They last 6 years on average. Even if enterprise HDDs never failed, they still wouldn’t make sense.
  • Insource & use math for drive purchases. Drives are the bulk of the cost. Chart with time vs. price per gigabyte. Talking about the Thailand Hard Drive Crisis.
  • Software: Put all intelligence here.
  • Backblaze Vault: 20 hard drives create 1 tome that shares parts of a file, spread across racks.
  • Avoid choke points. Every single storage pod is a first-class citizen. We can parallelize.
  • Algorithmically monitor SMART stats. Know which SMART codes correlate to annual failure rate. All the data is available on the site (all the codes for all the drives). https://www.backblaze.com/SMART
  • Plan for silent corruption. Bad drive looks exactly like a good drive.
  • Put replication above the file system.
  • Run out of resources simultaneously. Hardware and software together. Avoid having CPU pegged and your memory unused. Have your resources in balance, tweak over time.
  • Model and monitor storage burn. It’s important not to have too much or too little storage. Leading indicator is not storage, it’s bandwidth.
  • Business processes. Design for failure, but fix failures quickly. Drives will die, it’s what happens at scale.
  • Create repeatable repairs. Avoid the need for specialized people to do repair. Simple procedures: either swap a drive or swap a pod. Requires 5 minutes of training.
  • Standardize on the pod chassis. Simplifies so many things…
  • Use ROI to drive automation. Sometimes doing things twice is cheaper than automation. Know when it makes sense.
  • Workflow for storage buffer. Treat buffer in days, not TB. Model how many days of space available you need. Break into three different buffer types: live and running vs. in stock but not live vs. parts. (Illustrated in the sketch after this list.)
  • Culture: question “conventional wisdom”. No hardware worshippers. We love our red storage boxes, but we are a software team.
  • Agile extends to hardware. Storage Pod Scrum, with product backlog, sprints, etc.
  • Relentless focus on cost: Is it required? Is there a comparable lower cost option? Can business processes work around it? Can software work around it?
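A trivial sketch of the "buffer in days, not TB" idea: divide each buffer pool by the observed burn rate to get days of runway. All numbers are hypothetical.

burn_tb_per_day = 60                     # assumed average intake over recent weeks

buffers_tb = {
    "live (racked, accepting data)": 900,
    "in stock (built, not racked)": 1200,
    "parts (drives and chassis)":   2400,
}

for name, capacity_tb in buffers_tb.items():
    print(f"{name}: {capacity_tb / burn_tb_per_day:.0f} days of runway")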

 

f4: Facebook’s Warm BLOB Storage System
Satadru Pan, Software Engineer, Facebook

  • White paper “f4: Facebook’s Warm BLOB Storage System” at http://www-bcf.usc.edu/~wyattllo/papers/f4-osdi14.pdf
  • Looking at how data cools over time. 100x drop in reads in 60 days.
  • Handling failure. Replication: 1.2 * 3 = 3.6. To lose data we need to lose 9 disks or 3 hosts. Hosts in different racks and datacenters.
  • Handling load. Load spread across 3 hosts.
  • Background: Data serving. CDN protects storage, router abstracts storage, web tier adds business logic.
  • Background: Haystack [OSDI2010]. Volume is a series of blobs. In-memory index.
  • Introducing f4: Haystack on cells. Cells = disks spread over a set of racks. Some compute resource in each cell. Tolerant to disk, host, rack or cell failures.
  • Data splitting: Split data into smaller blocks. Reed Solomon encoding, Create stripes with 5 data blocks and 2 parity blocks.
  • Blobs laid out sequentially in a block. Blobs do not cross block boundary. Can also rebuild blob, might not need to read all of the block.
  • Each stripe in a different rack. Each block/blob split into racks. Mirror to another cell. 14 racks involved.
  • Read. Router does Index read, Gets physical location (host, filename, offset). Router does data read. If data read fails, router sends request to compute (decoders).
  • Read under datacenter failure. Replica cell in a different data center. Router proxies read to a mirror cell.
  • Cross-datacenter XOR. Third cell has a byte-by-byte XOR of the first two. Now mix this across 3 cells (triplet). Each has 67% data and 33% replica. 1.5 * 1.4 = 2.1X. (See the sketch after this list.)
  • Looking at reads with datacenter XOR. Router sends two read requests to two local routers. Builds the data from the reads from the two cells.
  • Replication factors: Haystack with 3 copies (3.6X), f4 2.8 (2.8X), f4 2.1 (2.1X). Reduced replication factor, increased fault tolerance, increase load split.
  • Evaluation. What and how much data is “warm”?
  • CDN data: 1 day, 0.5 sampling. BLOB storage data: 2 week, 0.1%, Random distribution of blobs assumed, the worst case rates reported.
  • Hot data vs. Warm data. 1 week – 350 reads/sec/disk, 1 month – 150r/d/s, 3 months – 70r/d/s, 1 year 20r/d/s. Wants to keep above 80 reads/sec/disk. So chose 3 months as divider between hot and warm.
  • It is warm, not cold. Chart of blob age vs access. Even old data is read.
  • F4 performance: most loaded disk in cluster: 35 reads/second. Well below the 80r/s threshold.
  • F4 performance: latency. Chart of latency vs. read response. F4 is close to Haystack.
  • Conclusions. Facebook blob storage is big and growing. Blobs cool down with age very rapidly. 100x drop in reads in 60 days. Haystack 3.6 replication over provisioning for old, warm data. F4 encodes data to lower replication to 2.1X, without compromising performance significantly.
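A small Python sketch of the cross-datacenter XOR scheme described above: a third cell keeps the byte-by-byte XOR of blocks from two other cells, so the content of any one failed datacenter can be rebuilt from the remaining two. Block contents are toy data, not Facebook's format.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

block_cell_a = b"blob bytes stored in cell A....."
block_cell_b = b"different blob bytes in cell B.."
assert len(block_cell_a) == len(block_cell_b)

block_cell_c = xor(block_cell_a, block_cell_b)    # XOR copy kept in a third datacenter

# Datacenter A goes down: rebuild its block from B and the XOR cell.
rebuilt_a = xor(block_cell_b, block_cell_c)
assert rebuilt_a == block_cell_a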

 

Pelican: A Building Block for Exascale Cold Data Storage
Austin Donnelly, Principal Research Software Development Engineer, Microsoft

  • White paper “Pelican: A building block for exascale cold data storage” at http://research.microsoft.com/pubs/230697/osdi2014-Pelican.pdf
  • This is research, not a product. No product announcement here. This is a science project that we offer to the product teams.
  • Background: Cold data in the cloud. Latency (ms to hours) vs. frequency of access. SSD, 15K rpm HDD, 7.2K rpm HDD, Tape.
  • Defining hot, warm, archival tiers. There is a gap between warm and archival. That’s where Pelican (cold) lives.
  • Pelican: Rack-scale co-design. Hardware and software (power, cooling, mechanical, HDD, software). Trade latency for lower cost. Massive density, low per-drive overhead.
  • Pelican rack: 52U, 1152 3.5” HDD. 2 servers, PCIe bus stretched rack wide. 4 x 10Gb links. Only 8% of disks can spin.
  • Looking at pictures of the rack. Very little there. Not many cables.
  • Interconnect details. Port multiplier, SATA controller, Backplane switch (PCIe), server switches, server, datacenter network. Showing bandwidth between each.
  • Research challenges: Not enough cooling, power, bandwidth.
  • Resource use: Traditional systems can have all disks running at once. In Pelican, a disk is part of a domain: power (2 of 16), cooling (1 of 12), vibration (1 of 2), bandwidth (tree).
  • Data placement: blob erasure-encoded on a set of concurrently active disks. Sets can conflict in resource requirement.
  • Data placement: random is pretty bad for Pelican. Intuition: concentrate conflicts over a few sets of disks. 48 groups of 24 disks. 4 classes of 12 fully-conflicting groups. Blob stored over 18 disks (15+3 erasure coding).
  • IO scheduling: “spin up is the new seek”. All our IO is sequential, so we only need to optimize for spin up. Four schedulers, with 12 groups per scheduler, only one active at a time. (A toy scheduler follows this list.)
  • Naïve scheduler: FIFO. Pelican scheduler: request batching – trade between throughput and fairness.
  • Q: Would this much spin up and down reduce endurance of the disks? We’re studying it, not conclusive yet, but looking promising so far.
  • Q: What kind of drive? Archive drives, not enterprise drives.
  • Demo. Showing system with 36 HBAs in device manager. Showing Pelican visualization tool. Shows trays, drives, requests. Color-coded for status.
  • Demo. Writing one file: drives spin up, request completes, drives spin down. Reading one file: drives spin up, read completes, drives spin down.
  • Performance. Compare Pelican to a mythical beast. Results based on simulation.
  • Simulator cross-validation. Burst workload.
  • Rack throughput. Fully provisioned vs. Pelican vs. Random placement. Pelican works like fully provisioned up to 4 requests/second.
  • Time to first byte. Pelican adds spin-up time (14.2 seconds).
  • Power consumption. Comparing all disks on standby (1.8kW) vs. all disks active (10.8kW) vs. Pelican (3.7kW).
  • Trace replay: European Centre for Medium-Range Weather Forecasts. Every request for 2.4 years. Run through the simulator. Tiering model. Tiered system with primary storage, cache and Pelican.
  • Trace replay: Plotting highest response time for a 2h period. Response time was not bad, simulator close to the rack.
  • Trace replay: Plotting deepest queues for a 2h period. Again, simulator close to the rack.
  • War stories. Booting a system with 1152 disks (BIOS changes needed). Port multiplier – port 0 (firmware change needed). Data model for system (serial numbers for everything). Things to track: slots, volumes, media.
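A toy Python model of Pelican-style placement and scheduling, assuming a simple mapping of groups to conflict classes: a blob is placed entirely inside one group, and each class's scheduler spins up one group at a time and drains its queued requests before moving on ("spin up is the new seek"). The group and class counts come from the talk; everything else is simplified.

import collections, random

GROUPS = 48                       # 48 groups of 24 disks
CLASSES = 4                       # 4 classes of 12 fully-conflicting groups

def group_class(group):
    return group % CLASSES        # assumption: simple group-to-class mapping

queues = collections.defaultdict(collections.deque)    # class -> queued (blob, group) requests

def submit(blob_id):
    group = random.randrange(GROUPS)      # placement keeps the whole blob inside one group
    queues[group_class(group)].append((blob_id, group))

def run_schedulers():
    for cls, q in queues.items():         # one scheduler per class, one active group at a time
        by_group = collections.defaultdict(list)
        while q:
            blob_id, group = q.popleft()
            by_group[group].append(blob_id)
        for group, blobs in by_group.items():
            print(f"class {cls}: spin up group {group}, serve blobs {blobs}, spin down")

for blob_id in range(8):
    submit(blob_id)
run_schedulers()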

 

Torturing Databases for Fun and Profit
Mai Zheng, Assistant Professor Computer Science Department – College of Arts and Sciences, New Mexico State University

  • White paper “Torturing Databases for Fun and Profit” at https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-zheng_mai.pdf
  • Databases are used to store important data. Should provide ACID properties: atomicity, consistency, isolation, durability – even under failures.
  • List of databases that passed the tests: <none>. Everything is broken under simulated power faults.
  • Power outages are not that uncommon. Several high profile examples shown.
  • Fault model: clean termination of I/O stream. Model does not introduce corruption/dropping/reorder.
  • How to test: Connect database to iSCSI target, then decouple the database from the iSCSI target.
  • Workload example. Key/value table. 2 threads, 2 transactions per thread.
  • Known initial state, each transaction updates N random work rows and 1 meta row. Fully exercise concurrency control.
  • Simulates power fault during our workload. Is there any ACID violation after recovery? Found atomicity violation.
  • Capture I/O trace without kernel modification. Construct a post-fault disk image. Check the post-fault DB. (Sketched after this list.)
  • This makes testing different fault points easy. But enhanced it with more context, to figure out what makes some fault points special.
  • With that, five patterns were found (e.g., unintended updates to mmap’ed blocks). Pattern-based ranking of which fault injection points will lead to a pattern.
  • Evaluated 8 databases (open source and commercial). Not a single database could survive.
  • The most common violation was durability. Some violations are difficult to trigger, but the framework helped.
  • Case study: A TokyoCabinet Bug. Looking at the fault and why the database recovery did not work.
  • Pattern-based fault injection greatly reduced test points while achieving similar coverage.
  • Wake up call: Traditional testing methodology may not be enough for today’s complex storage systems.
  • Thorough testing requires purpose-built workloads and intelligent fault injection techniques.
  • Different layers in the OS can help in different ways. For instance, iSCSI is an ideal place for fault injection.
  • We should bridge the gaps in understanding and assumptions. For instance, durability might not be provided by the default DB configuration.
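A sketch of the record-and-replay fault injection described above: capture the ordered block writes, then build a post-fault disk image by applying only a prefix of them, modeling a clean cut of the I/O stream with no corruption or reordering. Offsets and payloads are hypothetical, and the mount/recover/check steps are left out.

def build_post_fault_image(base_image, write_trace, fault_point):
    """Apply only the writes issued before the simulated power cut."""
    image = bytearray(base_image)
    for offset, data in write_trace[:fault_point]:
        image[offset:offset + len(data)] = data
    return bytes(image)

base = bytes(64)                                      # known initial state (all zeros)
trace = [(0, b"TXN1-META"), (16, b"TXN1-DATA"),       # hypothetical captured writes, in order
         (32, b"TXN2-META"), (48, b"TXN2-DATA")]

for cut in range(len(trace) + 1):
    img = build_post_fault_image(base, trace, cut)
    # The real framework would mount this image, run database recovery and
    # check the ACID properties; here we just count which writes survived.
    print(f"fault after {cut} writes: {img.count(b'TXN')} of {len(trace)} writes visible")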

 

Personal Cloud Self-Protecting Self-Encrypting Storage Devices
Robert Thibadeau, Ph.D., Scientist and Entrepreneur, CMU, Bright Plaza
http://www.snia.org/sites/default/files/DSS-Summit-2015/presentations/RobertThibadeau_Personal%20Cloud.pdf

  • This talk is about personal devices, not enterprise storage.
  • The age of uncontrolled data leaks. Long list of major hacks recently. All phishing initiated.
  • Security ~= Access Control.  Security should SERVE UP privacy.
  • Computer security ~= IPAAAA, Integrity, Private, Authentication, Authorization, Audit, Availability. The first 3 are encryption, the other aren’t.
  • A storage device is a computing device. Primary host interface, firmware, special hardware functions, diagnostic parts, probe points.
  • For years, there was a scripting language inside the drives.
  • TCG Core Spec. Core (Data Structures, Basic Operations) + Scripting (Amazing use cases).
  • Security Provider: Admin, Locking, Clock, Forensic Logging, Crypto services, internal controls, others.
  • What is an SED (Self-Encrypting Device)? Drive Trust Alliance definition: Device uses built-in hardware encryption circuits to read/write data in/out of NV storage.
  • At least one Media Encryption Key (MEK) is protected by at least one Key Encryption Key (KEK, usually a “password”). (A toy sketch follows this list.)
  • Self-Encrypting Storage. Personal Storage Landscape. People don’t realize how successful it is.
  • All self-encrypting today: 100% of all SSDs, 100% of all enterprise storage (HDD, SSD, etc.), all iOS devices, 100% of WD USB HDDs.
  • Much smaller number of personal HDDs are Opal or SED. But Microsoft Bitlocker supports “eDrive” = Opal 2.0 drives of all kinds.
  • You lose 40% of performance of a phone if you’re doing software encryption. You must do it in hardware.
  • Working on NVM right now.
  • Drive Trust Alliance: sole purpose to facilitate adoption of Personal SED. www.drivetrust.org
  • SP-SED Rule 1 – When we talk about cloud things, every personal device is actually in the cloud so… Look in the clouds for what should be in personal storage devices.
  • TCG SED Range. Essentially partitions in the storage devices that have their own key. Bitlocker eDrive – 4 ranges. US Government uses DTA open source for creating resilient PCs using ranges. BYOD and Ransomware protection containers.
  • Personal Data Storage (PDS). All data you want to protect can be permitted to be queried under your control.
  • Example: You can ask if you are over 21, but not what your birthday is or how old you are, although data is in your PDS.
  • MIT Media Lab, OpenPDS open source offered by Kerberos Consortium at MIT.
  • Homomorphic Encryption. How can you do computing operations on encrypted data without ever decrypting the data. PDS: Ask questions without any possibility of getting at the data.
  • It’s so simple, but really hard to get your mind wrapped around it. The requests come encrypted, results are encrypted and you can never see the plaintext over the line.
  • A general solution was discovered, but it was computationally infeasible (like Bitcoin). Only in the last few years (since 2011) has it improved.
  • HE Cloud Model and SP-SED Model. Uses OAuth. You can create personal data and you can get access to questions to your personal data. No plain text.
  • Solution for Homomorphic Encryption. Examples – several copies of the data. Multiple encryption schemes. Each operation (Search, Addition, Multiplication) uses a different scheme.
  • There’s a lot of technical work on this now. Your database will grow a lot to accommodate these kinds of operations.
  • SP-SED Rule 2 – Like the internet cloud: if anybody can make money off an SP-SED, then people get really smart really fast… SP-SED should charge $$ for access to the private data they protect.
  • The TCG Core Spec was written with this in mind. PDS and Homomorphic Encryption provide a conceptual path.
  • Challenges to you: The TCG Core was designed to provide service identical to the Apple App Store, but in Self-Protecting Storage devices. Every personal storage device should let the owner of the device make money off his private data on it.
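A toy illustration of the MEK/KEK relationship described above: the media encryption key never leaves the device in the clear, the user password only derives the key-encryption key that wraps it, and a secure erase simply replaces the MEK. XOR stands in for the device's real AES key wrap and media encryption, so this is strictly conceptual and not usable for protecting real data.

import hashlib, os

def derive_kek(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

salt = os.urandom(16)
mek = os.urandom(32)                         # generated inside the device, never exported
kek = derive_kek("correct horse battery", salt)
wrapped_mek = xor_bytes(mek, kek)            # only the wrapped form is stored on media

# Unlock: re-derive the KEK from the password and unwrap the MEK.
unwrapped = xor_bytes(wrapped_mek, derive_kek("correct horse battery", salt))
assert unwrapped == mek

# Crypto-erase: replace the MEK; every sector it encrypted is now unreadable.
mek = os.urandom(32)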

 

Hitachi Data Systems – Security Directions and Trends
Eric Hibbard, Chair SNIA Security Technical Working Group, CTO Security and Privacy HDS

  • Protecting critical infrastructure. No agreement on what is critical.
  • What are the sections of critical infrastructure (CI)? Some commonality, but no agreement. US=16 sectors, CA=10, EU=12, UK=9, JP=10.
  • US Critical Infrastructure. Less than 20% controlled by the government. Significant vulnerabilities. Good news is that cybersecurity is a focus now. Bad news: a lot of interdependencies (lots of things depend on electric power).
  • Threat landscape for CI. Extreme weather, pandemics, terrorism, accidents/technical failures, cyber threats.
  • CI Protection – Catapulted to the forefront. Several incidents, widespread concern, edge of cyber-warfare, state-sponsored actions.
  • President Obama declared a National Emergency on 04/01/2015 due to the rising number of cyberattacks.
  • CI protection initiatives. CI Decision-making organizations, CIP decisions. CIP decision-support system. The goal is to learn from attacks, go back and analyze what we could have done better.
  • Where is the US public sector going? Rethinking strategy, know what to protect, understand value of information, beyond perimeter security, cooperation.
  • Disruptive technologies:  Mobile computing, cloud computing, machine-to-machine, big data analytics, industrial internet, Internet of things, Industry 4.0, software defined “anything”. There are security and privacy issues for each. Complexity compounded if used together.
  • M2M maturity. Machine-to-machine communication between devices that are extremely intelligent, maybe AI.
  • M2M analytics building block. Big Data + M2M. This is the heart and soul of smart cities. This must be secured.
  • IoT. 50 billion connected objects expected by 2020. These will stay around for a long time. What if they are vulnerable and inside a wall?
  • IoT will drive big data adoption. Real time and accurate data sensing. They will know where you are at any point in time.
  • CI and emerging technology. IoT helps reduce cost, but it increases risks.
  • Social Infrastructure (Hitachi View). Looking at all kinds of technologies and their interplay. It requires collaborative system.
  • Securing smart sustainable cities. Complex systems, lots of IoT and cloud and big data, highly vulnerable. How to secure them?

 

Enterprise Key Management & KMIP: The Real Story  – Q&A with EKM Vendors
Moderator: Tony Cox, Chair SNIA Storage Security Industry Forum, Chair OASIS KMIP Technical Committee
Panelists: Tim Hudson, CTO, Cryptsoft
Nathan Turajski, Senior Product Manager, HP
Bob Lockhart, Chief Solutions Architect, Thales e-Security, Inc
Liz Townsend, Director of Business Development, Townsend Security
Imam Sheikh, Director of Product Management, Vormetric Inc

  • Goal: Q&A to explore perspective in EKM, KMIP.
  • What are the most critical concerns and barriers to adoption?
  • Some of the developers that built the solution are no longer there. Key repository is an Excel spreadsheet. Need to explain that there are better key management solutions.
  • Different teams see this differently (security, storage). Need a set of requirements across teams.
  • Concern with using multiple vendors, interoperability.
  • Getting the right folks educated about basic key management, standards, how to evaluate solutions.
  • Understanding the existing solutions already implemented.
  • Would you say that the OASIS key management standard has progressed to a point where it can be implemented with multiple vendors?
  • Yes, we have demonstrated this many times.
  • Trend to use KMIP to pull keys down from repository.
  • Different vendors excel in different areas and complex system do use multiple vendors.
  • We have seen migrations from one vendor to another. The interoperability is real.
  • KMIP has become a cost of entry. Vendors that do not implement it are being displaced.
  • It’s not just storage. Mobile and Cloud as well.
  • What’s driving customer purchasing? Is it proactive or reactive? With interoperability, where is the differentiation?
  • It’s a mix of proactive and reactive. Each vendor has different background and different strengths (performance, clustering models). There are also existing vendor relationships.
  • Organizations still buy for specific applications.
  • It’s mixed, but some customers are planning two years down the line. One vendor might not be able to solve all the problems.
  • Compliance is driving a lot of the proactive work, although meeting compliance is a low bar.
  • Storage drives a lot of it, storage encryption drives a lot of it.
  • What benefits are customers looking for when moving to KMIP? Bad guy getting to the key, good guy losing the key, reliably forget the key to erase data?
  • There’s quite a mix of priorities: operational requirements not to disrupt operations, and assurances that a key has been destroyed and is not kept anywhere.
  • Those were all possible before. KMIP is about making those things easier to use and integrate.
  • Motivation is to follow the standard, auditing key transitions across different vendors.
  • When I look at the EU regulation, cloud computing federating key management. Is KMIP going to scale to billions of keys in the future?
  • We have vendors that work today with tens of billions of keys and moving beyond that. The underlying technology to handle federation is there; the products will mature over time.
  • It might actually be trillions of keys, when you count all the applications like the smart cities, infrastructure.
  • When LDAP is fully secure and everything is encrypted, how do secure and unsecure merge?
  • Having conversations about different levels of protections for different attributes and objects.
  • What is the difference between local key management and remote or centralized approaches?
  • There are lots of best practices in the high scale solutions (like separation of duties), and not all of them are there for the local solution.
  • I don’t like to use simple and enterprise to classify. It’s better to call them weak and strong.
  • There are scenarios where the key needs to local for some reason, but need to secure the key, maybe have a hybrid solution with a cloud component.
  • Some enterprises think in terms of individual projects, local key management. If they step back, they will see the many applications and move to centralized.
  • As the number of keys grows, will we need a lot more repositories with more interop?
  • Yes. It is more and more a requirement, like in cloud and mobile.
  • Use KMIP layer to communicate between them.
  • We’re familiar with use cases, but what about abuse cases? How do we protect that infrastructure?
  • It goes back to not doing security by obscurity.
  • You use a standard and audit the accesses. The system will be able to audit, analyze and alert you when it sees these abuses.
  • The repository has to be secure, with two-factor authentication, real time monitoring, allow lists for who can access the system. Multiple people to control your key sets.
  • Key management is part of the security strategy, which needs to be multi-layered.
  • Simple systems and a common language are a vector for attack, but we need to do it.
  • Key management and encryption are not the be-all and end-all. There must be multiple layers: firewall, access control, audit, logging, etc. It needs to be comprehensive.

 

Lessons Learned from the 2015 Verizon Data Breach Investigations Report
Suzanne Widup, Senior Analyst, Verizon
http://www.snia.org/sites/default/files/DSS-Summit-2015/presentations/SuzanneWidupLearned_Lessons_Verizon.pdf

  • Fact based research, gleaned from case reports. Second year that we used data visualization. Report at http://www.verizonenterprise.com/DBIR/2015/
  • 2015 DBIR: 70 contributing organizations, 79,790 security incidents, 2,122 confirmed data breaches, 61 countries
  • The VERIS framework (actor – who did it, action – how they did it, asset – what was affected, attribute – how it was affected). Given away for free.
  • We can’t share all the data, but some of it is publicly disclosed and available in a GitHub repository as JSON files: http://www.vcdb.org (see the sketch after this list).
  • You can be a part of it. Vcdb.org needs volunteers – be a security hero.
  • Looking at incidents vs. breaches. Divided by industry. Some industries have higher vulnerabilities, but a part of it is due to visibility.
  • Which industries exhibit similar threat profiles? There might be other industries that look similar to yours…
  • Zooming into healthcare and other industries with similar threat profiles.
  • Threat actors. Mostly external. Less than 20% internal.
  • Threat actions. Credentials (down), RAM scrapers (up), spyware/keyloggers (down), phishing (up).
  • The detection deficit. Overall trend is still pretty depressing. The bad guys are innovating faster than we are.
  • Discovery time line (from 2015). Mostly discovered in days or less.
  • The impact of breaches. We were not equipped to measure impact before. This year we partnered with insurance companies. We only have 50% of what is going on here.
  • Plotting the impact of breaches. If you look at the number of incidents, it was going down. If you look at the records lost, it is growing.
  • Charting number of records (1 to 100M) vs. expected loss (US$). There is a band from optimist to pessimist.
  • The nefarious nine: misc errors, crimeware, privilege misuse, lost/stolen assets, web applications, denial of service, cyber-espionage, point of sale, payment card skimmers.
  • Looks different if you use just breaches instead of all incidents. Point of sale is higher, for instance.
  • All incidents, charted over time (graphics are fun!)
  • More charts. Actors and the nine patterns. Breaches by industry.
  • Detailed look at point of sale (highest in accommodation, entertainment and retail), crimeware, cyber-espionage (lots of phishing), insider and privilege misuse (financial motivation), lost/stolen devices, denial of service.
  • Threat intelligence. Share early so it’s actionable.
  • Phishing for hire companies (23% of recipients open phishing messages, 11% click on attachments)
  • 10 CVEs account for 97% of exploits. Pay attention to the old vulnerabilities.
  • Mobile malware. Android “wins” over iOS.
  • Two-factor authentication and patching web servers mitigate 24% of vulnerabilities each.
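
Since the VCDB data mentioned above is published as JSON, you can poke at it yourself with a few lines of PowerShell. This is only a rough sketch and not part of the presentation: the repository path and the property names below are my assumptions, so check vcdb.org for the actual file layout and VERIS schema.

# Sketch: load VCDB incident files and count which actor categories appear.
# The .\VCDB\data\json path and the "actor" property are assumptions; adjust to the real repository layout.
$incidents = Get-ChildItem .\VCDB\data\json -Filter *.json -Recurse |
    ForEach-Object { Get-Content $_.FullName -Raw | ConvertFrom-Json }
"Total incidents loaded: $($incidents.Count)"
$incidents |
    ForEach-Object { $_.actor.PSObject.Properties.Name } |
    Group-Object |
    Sort-Object Count -Descending |
    Format-Table Name, Count -AutoSize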

My Top Reasons to Use OneDrive


 

As you might have noticed, I am now in the OneDrive team. Since I’ve been here for a few months, I think I earned the right to start sharing a few blogs about OneDrive. I’ll do that over the next few months, focusing on the user’s view of OneDrive (as opposed to the view we have from the inside).

 

To get things started, this post shares my top reasons to use OneDrive. As you probably already heard, OneDrive is a cloud storage solution by Microsoft. You can upload, download, sync, and share files from your PC, Mac, Phone or Tablet. Here are a few reasons why I like to use OneDrive.

 

1) Your files in the cloud. The most common reason for using OneDrive is to upload or synchronize your local data to the cloud. This will give you one extra copy of your documents, pictures and videos, which you could use if your computer breaks. Remember the 3-2-1 rule: have 3 copies of your important files, 2 in different media, 1 in another site. For instance, you could have one copy of your files on your PC, one copy on an external drive and one copy in OneDrive.

 


 

2) View and edit Office documents. OneDrive offers a great web interface that you can access from anywhere you have a OneDrive client, or by using the http://onedrive.com web site. The site includes viewers for common data types like videos and pictures. For your Office documents, you can use the great new Office apps for Windows, Mac OS X, Windows Phone, iOS and Android. You can also use the web versions of Word, Excel, PowerPoint or OneNote right from the OneDrive.com web site to create, view and edit your documents (even if Office is not installed on the machine).

 


 

3) Share files with others. Once your data is in the cloud, you have the option to share a file or an entire folder with others. You can use this to share pictures with your family or to share a document with a colleague. It’s simple to share, simple to access and you can stop sharing at any time.  OneDrive has a handy feature to show files shared with you as part of your drive and it’s quite useful.

 


 

4) Upload your photos automatically. If you use a phone or tablet to take pictures and video, you can configure it to automatically upload them to OneDrive. This way your cherished memories will be preserved in the cloud. If you’re on vacation and your phone is lost or stolen, you can replace the phone, knowing that your files were already preserved. We have OneDrive clients for Windows Phone, iOS and Android.

 


 

5) Keep in sync across devices. If you have multiple computers, you know how hard it is to keep data in sync. With OneDrive, you can keep your desktop, your laptop and your tablet in sync, automatically. We have OneDrive sync clients for Windows and Mac OS X. You also have an option to sync only a subset of your folders. This will help you have all your files on a computer with a large drive, but only a few folders on another computer with limited storage.

 


 

6) Search. OneDrive offers a handy search feature that can help you find any of your files. Beyond simply searching for document names or text inside your documents, OneDrive will index the text inside your pictures, the types of pictures (using tags like #mountain, #people, #car or #building) and the place where a picture was taken.

 


 

Did I forget something important? Use the comments to share other reasons why you like to use OneDrive…

Perhaps OneDrive


 

Perhaps OneDrive

Perhaps OneDrive’s like a place to save
A shelter from the storm
It exists to keep your files
In their clean and tidy form
And in those times of trouble
When your PC is gone
The memory in OneDrive
will bring you home

Perhaps OneDrive is like a window
Perhaps like one full screen
On a watch or on a Surface Hub
Or anywhere in between
And even if you lose your cell
With pictures you must keep
The memory in OneDrive
will stop your weep.

OneDrive to some is like a cloud
To some as strong as steel
For some a way of sharing
For some a way to view
And some use it on Windows 10
Some Android, some iPhone
Some browse it on a friend’s PC
When away from their own

Perhaps OneDrive is like a workbench
Full of projects, full of plans
Like the draft of a great novel
your first rocket as it lands
If I should live forever
And all my dreams prevail
The memory in OneDrive
will tell my tale

PowerShell for finding the size of your local OneDrive folder


I would just like to share a couple of PowerShell scripts to find the size of your local OneDrive folder. Note that this just looks at folder structures and does not interact with the OneDrive sync client or the OneDrive service.

First, a one-liner to show the total files, bytes and GBs under the local OneDrive folder (typically C:\Users\Username\OneDrive):

$F=0;$B=0;$N=(Type Env:\UserProfile)+"\OneDrive";Dir $N -Recurse -File -Force|%{$F++;$B+=$_.Length};$G=$B/1GB;"$F Files, $B Bytes, $G GB" #PS OneDrive Size

Second, a slightly longer script that shows files, folders, bytes and GBs for all folders under the profile folder that start with “One”. That typically includes both your regular OneDrive folder and any OneDrive for Business folders:

$OneDrives = (Get-Content Env:\USERPROFILE)+"\One*"
Dir $OneDrives | % {
   $Files=0
   $Bytes=0
   $OneDrive = $_
   Dir $OneDrive -Recurse -File -Force | % {
       $Files++
       $Bytes += $_.Length
   }
   $Folders = (Dir $OneDrive -Recurse -Directory -Force).Count
   $GB = [System.Math]::Round($Bytes/1GB,2)
   Write-Host "Folder ‘$OneDrive’ has $Folders folders, $Files files, $Bytes bytes ($GB GB)"
}

Here is a sample output of the code above:

Folder ‘C:\Users\jose\OneDrive’ has 4239 folders, 33967 files, 37912177448 bytes (35.31 GB)
Folder ‘C:\Users\jose\OneDrive-Microsoft’ has 144 folders, 974 files, 5773863320 bytes (5.38 GB)
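
As a side note, if all you need are the totals, Measure-Object can do the counting and summing for you. This is just an alternative sketch, assuming the default OneDrive folder location under your user profile and a PowerShell version recent enough to support the -File switch:

# Sketch: total file count and size for the default OneDrive folder using Measure-Object
$OneDrive = Join-Path (Get-Content Env:\USERPROFILE) "OneDrive"
$Stats = Dir $OneDrive -Recurse -File -Force | Measure-Object -Property Length -Sum
$GB = [System.Math]::Round($Stats.Sum/1GB,2)
"$($Stats.Count) files, $($Stats.Sum) bytes ($GB GB)"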

The ABC language, thirty years later…


Back in March 1986, I was in my second year of college (Data Processing at the Universidade Federal do Ceara in Brazil). I was also teaching programming night classes at a Brazilian technical school. That year, I created a language called ABC, complete with a little compiler. It compiled the ABC code into pseudo code and ran it right away.

I actually used this language for a few years to teach an introductory programming class. Both the commands of the ABC language and the messages of the compiler were written in Portuguese. This made it easier for my Brazilian students to start in computer programming without having to know any English. Once they were familiar with the basic principles, they would start using conventional languages like Basic and Pascal.

The students would write some ABC code using a text editor and run the command “ABC filename” to compile and immediately run the code if no errors were found. The tool wrote a binary log entry for every attempt to compile/run a program with the name of the file, the error that stopped the compilation or how many instructions were executed. The teachers had a tool to read this binary log and examine the progress of a student over time.

I remember having a lot of fun with this project. The language was very simple and each command would have up to two parameters, followed by a semicolon. There were dozens of commands including:

  • Inicio (start, no action)
  • Fim (end, no action)
  • * (comment, no action)
  • Mova (move, move register to another register)
  • Troque (swap, swap contents of two registers)
  • Salve (save, put data into a register)
  • Restaure (restore, restore data from a register)
  • Entre (enter, receive input from the keyboard)
  • Escreva (write, write to the printer)
  • Escreva> (writeline, write to the printer and jump to the next line)
  • Salte (jump, jump to the next printed page)
  • Mostre (display, display on the screen)
  • Mostre> (displayline, display on the screen and jump to the next line)
  • Apague (erase, erase the screen)
  • Cursor (cursor, position the cursor at the specified screen coordinates)
  • Pausa (pause, pause for the specified seconds)
  • Bip (beep, make a beeping sound)
  • Pare (stop, stop executing the program)
  • Desvie (goto, jump to the specified line number)
  • Se (if, start a conditional block)
  • FimSe (endif, end a conditional block)
  • Enquanto (while, start a loop until a condition is met)
  • FimEnq (endwhile, end of while loop)
  • Chame (call, call a subroutine)
  • Retorne (return, return from a subroutine)
  • Repita (repeat, start a loop that repeats a number of times)
  • FimRep (endrepeat, end of repeat loop)
  • AbraSai (openwrite, open file for writing)
  • AbraEnt (openread, open file for reading)
  • Feche (close, close file)
  • Leia (read, read from file)
  • Grave (write, write to file)
  • Ponha (poke, write to memory address)
  • Pegue (peek, read from memory address)

The language used 26 pre-defined variables named after each letter. There were also 100 memory positions you could read/write into. I was very proud of how you could use complex expressions with multiple operators, parentheses, different numeric bases (binary, octal, decimal, hex) and functions like:

  • Raiz (square root)
  • Inverso (reverse string)
  • Caractere (convert number into ASCII character)
  • Codigo (convert ASCII character into a number)
  • FimArq (end of file)
  • Qualquer (random number generator)
  • Tamanho (length of a string)
  • Primeiro (first character of a string)
  • Restante (all but the first character of a string)

I had a whole lot of samples written in ABC, showcasing each of the commands, but I somehow lost them along the way. I also had a booklet that we used in the programming classes, with a series of concepts followed by examples in ABC. I could not find that either. Oh, well…

At least the source code survived (see below). I used an old version of Microsoft Basic running on a CP/M 2.2 operating system on a TRS-80 clone. Here are a few comments for those not familiar with that 1980’s language:

  • Line numbers were required. Colons were used to separate multiple commands in a single line.
  • Variables ending in $ were of type string. Variables with no suffix were of type integer.
  • Your variable names could be any length, but only the first 4 characters were actually used. Periods were allowed in variable names.
  • DIM was used to create arrays. Array dimensions were predefined and fixed. There wasn’t a lot of memory.
  • READ command was used to read from DATA lines. RESTORE would set the next DATA line to READ.
  • Files could be OPEN for sequential read (“I” mode), sequential write (“O” mode) or random access (“R” mode).

It compiled into a single ABC.COM file (that was the executable extension then). It also used the ABC.OVR file, which contained the error messages and up to 128 compilation log entries. Comments are in Portuguese, but I bet you can understand most of it. The code is a little messy, but keep in mind this was written 30 years ago…

 

2 '************************************************************
3 '*   COMPILADOR/EXECUTOR DE LINGUAGEM ABC - MARCO/1986      *
4 '*               Jose Barreto de Araujo Junior              *
5 '*     com calculo recursivo de expressoes aritmeticas      *
6 '************************************************************
10 ' Versao  2.0 em 20/07/86
11 ' Revisao 2.1 em 31/07/86
12 ' Revisao 2.2 em 05/08/86
13 ' Revisao 2.3 em 15/02/87
14 ' Revisao 2.4 em 07/06/87, em MSDOS
20 '********** DEFINICOES INICIAIS
21 DEFINT A-Z:CLS:LOCATE 1,1,1:ON ERROR GOTO 63000
22 C.CST=1:C.REGIST=2:LT$=STRING$(51,45)
25 DIM ENT$(30),RET$(30),TP(30),P1$(30),P2$(30)
30 DIM CMD(200),PR1$(199),PR2$(199)
35 DIM MEM$(99),REGIST$(26),PRM$(4),MSG$(99)
36 DIM CT(40),REP(10),REPC(10),ENQ(10),ENQ$(10),CHA(10)
40 DEF FNS$(X)=MID$(STR$(X),2)
55 OPER$="!&=#><+-*/^~":MAU$=";.[]()?*"
60 FUNC$="RAIZ     INVERSO  CARACTER CODIGO   FIMARQ   QUALQUER "
62 FUNC$=FUNC$+"TAMANHO  PRIMEIRO RESTANTE ARQUIVO  "
65 ESC$=CHR$(27):BIP$=CHR$(7):TABHEX$="FEDCBA9876543210"
66 OK$=CHR$(5)+CHR$(6)+CHR$(11)
70 M.LN=199:M.CMD=37:MAX=16^4/2-1
75 ESP$=" ":BK$=CHR$(8):RN$="R":IN$="I":OU$="O":NL$=""
80 OPEN RN$,1,"ABC2.OVR",32:FIELD 1,32 AS ER$
85 IF LOF(1)=0 THEN CLOSE:KILL"ABC2.OVR":PRINT "ABC2.OVR NAO ENCONTRADO":END
90 GOSUB 10000 '********** MOSTRA MENSAGEM INICIAL
95 PRINT "Nome do programa: ";:BAS=1:GOSUB 18000:AR$=RI$:GOSUB 10205
99 '********** DEFINICAO DOS COMANDOS
100 DIM CMD$(37),PR$(37):CHQ=0:RESTORE 125
105 FOR X=1 TO M.CMD:READ CMD$(X),PR$(X)
110    CHQ=CHQ+ASC(CMD$(X))+VAL(PR$(X))
115 NEXT : IF CHQ<>3402 THEN END
120 '********** TABELA DOS COMANDOS E PARAMETROS
125 DATA INICIO,10,FIM,10,"*",10
130 DATA MOVA,54,TROQUE,55
135 DATA SALVE,30,RESTAURE,30," ",00
140 DATA ENTRE,52,ESCREVA,42,ESCREVA>,42,MOSTRE,42,MOSTRE>,42
145 DATA SALTE,00,APAGUE,00,CURSOR,22,PAUSA,20,BIP,00
150 DATA PARE,00,DESVIE,40,SE,20," ",00,FIMSE,00
155 DATA ENQUANTO,20," ",00,FIMENQ,00,CHAME,20,RETORNE,00
160 DATA REPITA,20,FIMREP,00
165 DATA ABRASAI,30,ABRAENT,30,FECHE,00,LEIA,50,GRAVE,40
170 DATA PONHA,42,PEGUE,52
190 '********** ABRE ARQUIVO PROGRAMA
200 IF LEN(ARQ$)=0 THEN ERROR 99:GOTO 64000
210 OPEN RN$,2,ARQ$:ULT=LOF(2):CLOSE#2
220 IF ULT=0 THEN KILL ARQ$:ERROR 109:GOTO 64000
390 '********** COMPILACAO
400 N.ERR=0:N.LN=0:IDT=0:CT.SE=0:CT.REP=0:CT.ENQ=0:I.CT=0:LN.ANT=0:CMP=1
405 PRINT:PRINT:PRINT "Compilando ";ARQ$
406 IF DEPUR THEN PRINT "Depuracao"
407 PRINT
410 OPEN IN$,2,ARQ$
415 WHILE NOT EOF(2)
420     LN.ERR=0:LINE INPUT#2,LN$
422     IF INKEY$=ESC$ THEN PRINT "*** Interrompido":GOTO 64000
425     N.LN=N.LN+1:GOSUB 20000 '*ANALISE SINTATICA DA LINHA
430 WEND:CLOSE#2
435 FOR X=IDT TO 1 STEP -1
440     ERROR CT(X)+115
445 NEXT X
450 PRINT:PRINT FNS$(N.LN);" linha(s) compilada(s)"
490 '********** EXECUCAO
500 IF N.ERR THEN PRINT FNS$(N.ERR);" erro(s)":GOTO 64000
510 PRINT "0 erros"
515 PRINT "Executando ";ARQ$:PRINT
520 NL=1:CMP=0:N.CMD=0:CHA=0:ENQ=0:REP=0:SE=0:ESC=0
525 FOR X=1 TO 99:MEM$(X)="":NEXT:FOR X=1 TO 26:REGIST$(X)="":NEXT
530 WHILE NL<=M.LN
535     PNL=NL+1:CMD=CMD(NL):PR1$=PR1$(NL):PR2$=PR2$(NL)
540     IF CMD>3 THEN GOSUB 30000:N.CMD=N.CMD+1 '****** EXECUTA COMANDO
550     NL=PNL:REGIST$(26)=INKEY$
555     IF REGIST$(26)=ESC$ OR ESC=1 THEN NL=M.LN+1:PRINT "*** Interrompido"
560 WEND
570 PRINT:PRINT ARQ$;" executado"
580 PRINT FNS$(N.CMD);" comando(s) executado(s)"
590 PRINT:PRINT "Executar novamente? ";
600 A$=INPUT$(1):IF A$="S" OR A$="s" THEN PRINT "sim":GOTO 515
610 PRINT "nao";:GOTO 64000
9999 '********** ROTINA DE MENSAGEM INICIAL
10000 CLS:PRINT LT$
10020 XA$="| COMPILADOR/EXECUTOR DE LINGUAGEM ABC VERSAO 2.4 |"
10030 PRINT XA$:PRINT LT$:PRINT
10040 CHQ=0:FOR X=1 TO LEN(XA$):CHQ=CHQ+ASC(MID$(XA$,X,1)):NEXT
10050 IF CHQ<>3500 THEN END ELSE RETURN
10199 '********** ROTINA PARA PEGAR NOME DO ARQUIVO
10200 AR$=NL$:K=PEEK(128):FOR X=130 TO 128+K:AR$=AR$+CHR$(PEEK(X)):NEXT
10205 IF AR$="" THEN ERROR 99:GOTO 64000
10210 AR$=AR$+ESP$:PS=INSTR(AR$,ESP$)
10220 ARQ$=LEFT$(AR$,PS-1):RESTO$=MID$(AR$,PS+1)
10221 IF LEFT$(RESTO$,1)="?" THEN DEPUR=1
10230 FOR X=1 TO LEN(MAU$):P$=MID$(MAU$,X,1)
10240   IF INSTR(ARQ$,P$) THEN ERROR 100:GOTO 64000
10250 NEXT
10270 IF LEN(ARQ$)>12 THEN ERROR 100:GOTO 64000
10280 IF INSTR(ARQ$,".")=0 THEN ARQ$=ARQ$+".ABC"
10290 RETURN
17999 '********** ROTINA DE ENTRADA DE DADOS
18000 BAS$=FNS$(BAS):RI$=NL$
18010 A$=INPUT$(1)
18020 WHILE LEN(RI$)<255 AND A$<>CHR$(13) AND A$<>ESC$
18030    RET$=RI$
18040    IF A$=BK$ AND RI$<>NL$ THEN RI$=LEFT$(RI$,LEN(RI$)-1):PRINT ESC$;"[D ";ESC$;"[D";
18050    IF BAS=1 AND A$>=ESP$ THEN RI$=RI$+A$:PRINT A$;
18070    IF BAS>1 AND INSTR(17-BAS,TABHEX$,A$) THEN RI$=RI$+A$:PRINT A$;
18090    A$=INPUT$(1)
18100 WEND
18105 IF A$=ESC$ THEN ESC=1
18110 A$=RI$:GOSUB 42030:RI$=RC$:RETURN
18120 RETURN
18499 '********** CONVERTE PARA BASE ESTRANHA
18500 IF BAS=0 THEN BAS=1
18505 IF BAS=1 OR BAS=10 THEN RETURN
18510 A=VAL(A$):A$=""
18520 WHILE A>0:RS=A MOD BAS:A$=MID$(TABHEX$,16-RS,1)+A$:A=A\BAS:WEND
18525 IF A$="" THEN A$="0"
18530 RETURN
18999 '********** EXECUTA PROCURA DE FIMREP,FIMSE,FIMENQ
19000 IDT=0
19010 WHILE (CMD(PNL)<>FIM OR IDT>0) AND PNL<100
19020    IF CMD(PNL)=INI THEN IDT=IDT+1
19030    IF CMD(PNL)=FIM THEN IDT=IDT-1
19040    PNL=PNL+1
19050 WEND:PNL=PNL+1
19060 RETURN
19500 FOR X=1 TO LEN(UP$)
19510     PP$=MID$(UP$,X,1)
19520     IF PP$>="a" AND PP$<="z" THEN MID$(UP$,X,1)=CHR$(ASC(PP$)-32)
19530 NEXT X:RETURN
19600 N.PRM=N.PRM+1:PRM$(N.PRM)=LEFT$(A$,C-1):A$=MID$(A$,C+1)
19610 C=1:WHILE MID$(A$,C,1)=ESP$:C=C+1:WEND:A$=MID$(A$,C):C=0
19620 IF LEN(PRM$(N.PRM))=1 THEN PRM$(N.PRM)=CHR$(ASC(PRM$(N.PRM))+(PRM$(N.PRM)>"Z")*32)
19630 RETURN
19990 '********** ANALISE SINTATICA DA LINHA
19999 '********** RETIRA BRANCOS FINAIS E INICIAIS
20000 N.PRM=0:A$=LN$:PRM$(1)=NL$:PRM$(2)=NL$
20010 C=1:WHILE MID$(A$,C,1)=ESP$:C=C+1:WEND:A$=MID$(A$,C)
20020 C=LEN(A$)
20040 WHILE MID$(A$,C,1)=ESP$ AND C>0:C=C-1:WEND
20050 A$=LEFT$(A$,C):LN$=A$
20100 '********** ISOLA O NUMERO DA LINHA
20105 C=INSTR(A$,ESP$):NUM$=LEFT$(A$,C):A$=MID$(A$,C+1)
20110 C=1:WHILE MID$(A$,C,1)=ESP$:C=C+1:WEND:A$=MID$(A$,C)
20115 IF NUM$="" AND A$="" THEN RETURN
20120 PRINT NUM$;TAB(5+IDT*3);A$
20130 NL=VAL(NUM$):IF NL<1 OR NL>M.LN THEN ERROR 111:RETURN
20135 IF NL<=LN.ANT THEN ERROR 122:RETURN ELSE LN.ANT=NL
20140 IF MID$(A$,LEN(A$))<>";" THEN PRINT TAB(5+IDT*3);"*** ponto e virgula assumido aqui":A$=A$+";"
20200 '********** ISOLA COMANDO
20210 C=1:P=ASC(MID$(A$,C,1))
20220 WHILE P>59 OR P=42:C=C+1:P=ASC(MID$(A$,C,1)):WEND
20230 CMD$=LEFT$(A$,C-1):A$=MID$(A$,C):A$=LEFT$(A$,LEN(A$)-1)
20240 C=1:WHILE MID$(A$,C,1)=ESP$:C=C+1:WEND:A$=MID$(A$,C)
20300 '********** ISOLA PARAMETROS
20310 IF INSTR(A$,CHR$(34)) THEN GOSUB 27000
20315 PAR=0:C=1
20320 WHILE C<=LEN(A$) AND NPRM<4
20340    P$=MID$(A$,C,1)
20350    IF P$="(" THEN PAR=PAR+1
20360    IF P$=")" THEN PAR=PAR-1
20380    IF P$=ESP$ AND PAR=0 THEN GOSUB 19600
20390    C=C+1
20400 WEND
20410 IF A$<>NL$ THEN N.PRM=N.PRM+1:PRM$(N.PRM)=A$
20420 IF N.PRM>2 THEN ERROR 112:RETURN
20430 PR1$=PRM$(1):PR2$=PRM$(2)
20990 '********** IDENTIFICA COMANDO, 99=ERRO
21000 C.CMD=99:UP$=CMD$:GOSUB 19500:CMD$=UP$
21010 FOR X=1 TO M.CMD
21020   IF CMD$=CMD$(X) THEN C.CMD=X
21030 NEXT X
21040 IF C.CMD=99 THEN ERROR 114:RETURN
21050 CMD(NL)=C.CMD:PR1$(NL)=PR1$:PR2$(NL)=PR2$
21060 '********** ANALISE DE COMANDOS PARENTESIS
21100 C=C.CMD
21110 INI=-(C=21)-2*(C=24)-3*(C=29)
21120 FIM=-(C=23)-2*(C=26)-3*(C=30)
21130 IF INI THEN IDT=IDT+1:CT(IDT)=INI
21140 IF FIM THEN GOSUB 26000:IF LN.ERR THEN RETURN
21990 '********** IDENTIFICA PARAMETROS
22000 PR1=VAL(LEFT$(PR$(C.CMD),1)):PR2=VAL(RIGHT$(PR$(C.CMD),1))
22010 PR$=PR1$:PR=PR1:GOSUB 25000:IF LN.ERR THEN RETURN
22020 TIP.ANT=TIP2:PR$=PR2$:PR=PR2:GOSUB 25000
22025 IF PR1+PR2>7 AND TIP2<>TIP.ANT THEN ERROR 110
22030 RETURN
24990 '********** ANALISE DO PARAMETRO
25000 IF PR=0 AND PR$<>NL$ THEN ERROR 112:RETURN
25010 IF PR=1 OR PR=0 THEN RETURN
25020 ENT$(I)=PR$:GOSUB 41000:IF LN.ERR THEN RETURN
25030 TIP1=TP(I)
25040 I=I+1:ENT$(I)=PR$:GOSUB 40000:IF LN.ERR THEN RETURN
25050 TIP2=TP(I+1)
25060 IF PR=4 THEN RETURN
25070 IF PR=2 AND TIP2=1 THEN RETURN
25080 IF PR=3 AND TIP2=-1 THEN RETURN
25090 IF PR=5 AND TIP1=C.REGIST THEN RETURN
25110 ERROR 115:RETURN
25990 '********** ANALISE DE FIMSE,FIMENQ E FIMREP
26000 IF IDT=0 THEN ERROR 115+FIM:RETURN
26010 IF CT(IDT)<>FIM THEN ERROR 118+CT(IDT):IDT=IDT-1:GOTO 26000
26020 IDT=IDT-1:IF IDT<0 THEN IDT=0
26030 RETURN
26999 '********** TROCA "" POR ()1
27000 ASP=0
27010 WHILE INSTR(A$,CHR$(34))
27020     P=INSTR(A$,CHR$(34))
27030     IF ASP=0 THEN MID$(A$,P,1)="(" ELSE A$=LEFT$(A$,P-1)+")1"+MID$(A$,P+1)
27040     ASP=NOT ASP
27050 WEND
27060 RETURN
29999 '********** EXECUTA COMANDO
30000 IF DEPUR THEN PRINT USING "### & & &;";NL;CMD$(CMD);PR1$;PR2$
30005                ON CMD    GOSUB 30100,30200,30300,30400,30500
30010 IF CMD>5  THEN ON CMD-5  GOSUB 30600,30700,30800,30900,31000
30020 IF CMD>10 THEN ON CMD-10 GOSUB 31100,31200,31300,31400,31500
30030 IF CMD>15 THEN ON CMD-15 GOSUB 31600,31700,31800,31900,32000
30040 IF CMD>20 THEN ON CMD-20 GOSUB 32100,32200,32300,32400,32500
30050 IF CMD>25 THEN ON CMD-25 GOSUB 32600,32700,32800,32900,33000
30060 IF CMD>30 THEN ON CMD-30 GOSUB 33100,33200,33300,33400,33500,33600,33700
30080 RETURN
30099  ' COMANDO INICIO
30100 RETURN
30199  ' COMANDO FIM
30200 RETURN
30299  ' COMANDO *
30300 RETURN
30399  ' COMANDO MOVA
30400 I=I+1:ENT$(I)=PR2$:GOSUB 40000
30410 X1=ASC(PR1$):REGIST$(X1-64)=RET$(I+1):RETURN
30499  ' COMANDO TROQUE
30500 X1=ASC(PR1$)-64:X2=ASC(PR2$)-64:SWAP REGIST$(X1),REGIST$(X2):RETURN
30599  ' COMANDO SALVE
30600 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X$=RET$(I+1)
30602 OPEN OU$,3,X$:FOR X=0 TO 99:WRITE#3,MEM$(X):NEXT:CLOSE#3:RETURN
30699  ' COMANDO RESTAURE
30700 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X$=RET$(I+1)
30702 OPEN IN$,3,X$:FOR X=0 TO 99:LINE INPUT#3,MEM$(X):NEXT:CLOSE#3:RETURN
30799  ' COMANDO INDEFINIDO 3
30800 RETURN
30899  ' COMANDO ENTRE
30900 I=I+1:ENT$(I)=PR2$:GOSUB 40000:BAS=VAL(RET$(I+1))
30905 IF BAS=0 THEN IF PR1$>"M" THEN BAS=1 ELSE BAS=10
30910 GOSUB 18000:X1=ASC(PR1$)-64:REGIST$(X1)=RI$:PRINT:RETURN
30999  ' COMANDO ESCREVA
31000 IF PR1$<>"" THEN I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1$=RET$(I+1) ELSE X1$=""
31010 I=I+1:ENT$(I)=PR1$:GOSUB 40000:BAS=VAL(RET$(I+1))
31015 IF BAS>16 THEN ERROR 107
31020 A$=X1$:GOSUB 18500:LPRINT A$;:RETURN
31099  ' COMANDO ESCREVA>
31100 IF PR1$<>"" THEN I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1$=RET$(I+1) ELSE X1$=""
31110 I=I+1:ENT$(I)=PR2$:GOSUB 40000:BAS=VAL(RET$(I+1))
31115 IF BAS>16 THEN ERROR 107
31120 A$=X1$:GOSUB 18500:LPRINT A$:RETURN
31199  ' COMANDO MOSTRE
31200 I=I+1:ENT$(I)=PR2$:GOSUB 40000:X1=VAL(RET$(I+1))
31201 IF X1>16 THEN BAS=X1:ERROR 107
31205 IF PR1$<>"" THEN I=I+1:ENT$(I)=PR1$:GOSUB 40000:A$=RET$(I+1) ELSE A$=""
31210 BAS=X1:GOSUB 18500:PRINT A$;:RETURN
31299  ' COMANDO MOSTRE>
31300 I=I+1:ENT$(I)=PR2$:GOSUB 40000:X1=VAL(RET$(I+1))
31301 IF X1>16 THEN BAS=X1:ERROR 107
31305 IF PR1$<>"" THEN I=I+1:ENT$(I)=PR1$:GOSUB 40000:A$=RET$(I+1) ELSE PR1$=""
31310 BAS=X1:GOSUB 18500:PRINT A$:RETURN
31399  ' COMANDO SALTE
31400 LPRINT CHR$(12);:RETURN
31499  ' COMANDO APAGUE
31500 CLS:RETURN
31599  ' COMANDO CURSOR
31600 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1=VAL(RET$(I+1))
31610 I=I+1:ENT$(I)=PR2$:GOSUB 40000:X2=VAL(RET$(I+1))
31620 LOCATE X1,X2:RETURN
31699  ' COMANDO PAUSA
31700 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1=VAL(RET$(I+1))
31710 FOR X!=1 TO X1*1000:NEXT:RETURN
31799  ' COMANDO BIP
31800 BEEP:RETURN
31899  ' COMANDO PARE
31900 PNL=M.LN+1:RETURN
31999  ' COMANDO DESVIE
32000 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1=VAL(RET$(I+1))
32010 IF X1<1 OR X1>M.LN THEN ERROR 108
32020 PNL=X1:RETURN
32099  ' COMANDO SE
32100 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1=VAL(RET$(I+1))
32110 IF X1=0 THEN INI=21:FIM=23:GOSUB 19000:RETURN
32120 RETURN
32199  ' COMANDO INDEFINIDO 4
32200 RETURN
32299  ' COMANDO FIMSE
32300 RETURN
32399  ' COMANDO ENQUANTO
32400 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1=VAL(RET$(I+1))
32410 IF X1=0 THEN INI=24:FIM=26:GOSUB 19000:RETURN
32420 ENQ=ENQ+1:ENQ$(ENQ)=PR1$:ENQ(ENQ)=PNL:RETURN
32499  ' COMANDO INDEFINIDO 5
32500 RETURN
32599  ' COMANDO FIMENQ
32600 IF ENQ=0 THEN ERROR 120
32605 I=I+1:ENT$(I)=ENQ$(ENQ):GOSUB 40000:X1=VAL(RET$(I+1))
32610 IF X1>0 THEN PNL=ENQ(ENQ):RETURN
32620 ENQ=ENQ-1:RETURN
32699  ' COMANDO CHAME
32700 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1=VAL(RET$(I+1))
32710 IF X1<1 OR X1>M.LN THEN ERROR 108
32720 CHA=CHA+1:CHA(CHA)=PNL:PNL=X1:RETURN
32799  ' COMANDO RETORNE
32800 IF CHA=0 THEN ERROR 109
32810 PNL=CHA(CHA):CHA=CHA-1:RETURN
32899  ' COMANDO REPITA
32900 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X1=VAL(RET$(I+1))
32905 IF X1=0 THEN INI=29:FIM=30:GOSUB 19000:RETURN
32910 REP=REP+1:REPC(REP)=X1:REP(REP)=PNL:RETURN
32999  ' COMANDO FIMREP
33000 IF REP=0 THEN ERROR 118
33010 REPC(REP)=REPC(REP)-1:IF REPC(REP)>0 THEN PNL=REP(REP):RETURN
33020 REP=REP-1:RETURN
33099  ' COMANDO ABRASAI
33100 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X$=RET$(I+1)
33110 OPEN OU$,3,X$:RETURN
33199  ' COMANDO ABRAENT
33200 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X$=RET$(I+1)
33210 OPEN IN$,3,X$:RETURN
33299  ' COMANDO FECHE
33300 CLOSE#3:RETURN
33399  ' COMANDO LEIA
33400 LINE INPUT #3,X$
33410 X1=ASC(PR1$)-64:REGIST$(X1)=X$:RETURN
33499  ' COMANDO GRAVE
33500 I=I+1:ENT$(I)=PR1$:GOSUB 40000:X$=RET$(I+1)
33510 PRINT#3,X$:RETURN
33599  ' COMANDO PONHA
33600 I=I+1:ENT$(I)=PR1$:GOSUB 40000:XXXX$=RET$(I+1)
33610 I=I+1:ENT$(I)=PR2$:GOSUB 40000:X1=VAL(RET$(I+1))
33615 IF X1>99 THEN ERROR 124
33620 MEM$(X1)=XXXX$:RETURN
33699  ' COMANDO PEGUE
33700 X1=ASC(PR1$)-64
33710 I=I+1:ENT$(I)=PR2$:GOSUB 40000:X2=VAL(RET$(I+1))
33720 IF X2>99 THEN ERROR 124
33730 REGIST$(X1)=MEM$(X2):RETURN
39990 '********** AVALIA EXPRESSAO (RECURSIVA)
40000 GOSUB 41000:'**********AVALIA SINTAXE
40010 IF TP(I)=C.CST THEN GOSUB 42000:RET$(I)=RC$:I=I-1:RETURN
40020 IF TP(I)=C.REGIST THEN GOSUB 43000:RET$(I)=RR$:I=I-1:RETURN
40030 IF TP(I)<199   THEN GOSUB 40100:RETURN
40040 IF TP(I)<255   THEN GOSUB 40200:RETURN
40050 ERROR 101
40090 '********** FUNCAO
40100 I=I+1:ENT$(I)=P1$(I-1):GOSUB 40000
40110 P1$(I)=RET$(I+1):GOSUB 45000:RET$(I)=RP$:I=I-1:RETURN
40190 '********** OPERADOR
40200 I=I+1:ENT$(I)=P1$(I-1):GOSUB 40000:P1$(I)=RET$(I+1):TP(I)=TP(I)*TP(I+1)
40220 I=I+1:ENT$(I)=P2$(I-1):GOSUB 40000:P2$(I)=RET$(I+1)
40230 IF SGN(TP(I))<>TP(I+1) THEN ERROR 110
40240 GOSUB 47000:RET$(I)=RP$:I=I-1:RETURN
40990 '********** AVALIA SINTAXE
41000 A$=ENT$(I)
41010 IF LEN(A$)=1 AND VAL(A$)=0 AND A$<>"0" THEN TP(I)=2:ENT$(I)=CHR$(ASC(ENT$(I))+(ENT$(I)>"Z")*32):RETURN
41025 FOR XX=1 TO 6:B$=MID$(OPER$,XX*2-1,2):PAR=0
41030   FOR X=LEN(A$) TO 1 STEP -1:P$=MID$(A$,X,1)
41050     IF P$="(" THEN PAR=PAR+1
41060     IF P$=")" THEN PAR=PAR-1
41080     IF INSTR(B$,P$) AND PAR=0 THEN 41500
41090   NEXT X:IF PAR<>0 THEN ERROR 105
41105 NEXT XX
41110 P$=MID$(A$,1,1):PAR=0
41120 IF P$<>"(" THEN P1$(I)=A$:P2$(I)="10":TP(I)=1:RETURN
41130 FOR X=1 TO LEN(A$):P$=MID$(A$,X,1)
41140   IF P$="(" THEN PAR=PAR+1  
41160   IF P$=")" THEN PAR=PAR-1  
41170   IF P$=")" AND PAR=0 THEN 41200 
41180 NEXT X:ERROR 105
41200 P1$(I)=MID$(A$,2,X-2):P2$(I)=MID$(A$,X+1)
41220 IF VAL(P2$(I))>0 AND VAL(P2$(I))<17 THEN TP(I)=1:RETURN
41230 IF VAL(P2$(I))>16 THEN ERROR 107
41235 IF P2$(I)=NL$ THEN TP(I)=100:RETURN
41250 UP$=P2$(I):GOSUB 19500:FUN$=UP$:X=INSTR(FUNC$,FUN$):IF X=0 THEN ERROR 108
41260 TP(I)=(X-1)\9+1:IF (X MOD 9<>1)AND X>0 THEN ERROR 108
41270 TP(I)=100+TP(I):RETURN
41500 K=INSTR(OPER$,P$):P1$(I)=LEFT$(A$,X-1):P2$(I)=MID$(A$,X+1)
41530 TP(I)=200+K:RETURN
41990 '********** AVALIA CONSTANTE
42000 A$=P1$(I):BAS=VAL(P2$(I))
42030 VALOR=0:DIG=-1:IF BAS=1 THEN RC$=A$:TP(I)=-1:RETURN
42070 FOR X=LEN(A$) TO 1 STEP -1:DIG=DIG+1:P$=MID$(A$,X,1)
42080   IF INSTR(17-BAS,TABHEX$,P$)=0 THEN ERROR 103:GOTO 42120
42100   Y=16-INSTR(TABHEX$,P$):VALOR=VALOR+Y*BAS^DIG
42101   IF VALOR>MAX THEN ERROR 6:GOTO 42120
42110 NEXT X
42120 RC$=FNS$(VALOR):TP(I)=1:RETURN
42990 '********** AVALIA REGISTISTRADOR
43000 X=ASC(ENT$(I)):IF X<65 OR X>90 THEN ERROR 102:RETURN
43010 IF X-64>12 THEN TP(I)=-1 ELSE TP(I)=1
43020 RR$=REGIST$(X-64):RETURN
44990 '********** CALCULA FUNCAO
45000 P1$=P1$(I):P2$=P2$(I):TP=TP(I+1)
45005 ON TP(I)-99 GOSUB 45100,45110,45120,45130,45140,45150,45160,45170,45180,45190,45200
45010 RETURN
45100 RP$=P1$:TP(I)=TP:RETURN
45110 IF TP=-1 THEN ERROR 110
45115 RP$=FNS$(INT(SQR(VAL(P1$)))):TP(I)=1:RETURN
45120 IF TP=1  THEN ERROR 110
45125 RP$=CHR$(15)+P1$+CHR$(14):TP(I)=-1:RETURN
45130 IF TP=-1 THEN ERROR 110
45135 RP$=CHR$(VAL(P1$)):TP(I)=-1:RETURN
45140 IF TP=1  THEN ERROR 110
45145 RP$=FNS$(INT(ASC(P1$))):TP(I)=1:RETURN
45150 TP(I)=1:IF CMP=1 THEN RETURN
45151 IF EOF(3) THEN RP$="1" ELSE RP$="0"
45155 RETURN
45160 IF TP=-1 THEN ERROR 110
45165 RP$=FNS$(INT(RND(1)*VAL(P1$))+1):TP(I)=1:RETURN
45170 IF TP=1  THEN ERROR 110
45175 RP$=FNS$(LEN(P1$)):TP(I)=1:RETURN
45180 IF TP=1 THEN ERROR 110
45185 RP$=LEFT$(P1$,1):TP(I)=-1:RETURN
45190 IF TP=1 THEN ERROR 110
45195 RP$=MID$(P1$,2):TP(I)=-1:RETURN
45200 TP(I)=1:IF CMP=1 THEN RETURN
45203 OPEN RN$,2,P1$:RP$=FNS$(LOF(2)):CLOSE#2
45206 IF VAL(RP$)=0 THEN KILL P1$
45208 RETURN
46990 '********** CALCULA OPERADOR (OPERA)
47000 P1$=P1$(I):P2$=P2$(I):TP=SGN(TP(I)):TP(I)=ABS(TP(I))
47002 ON TP(I)-200 GOSUB 47210,47220,47230,47240,47250,47260
47005 IF TP(I)>206 THEN ON TP(I)-206 GOSUB 47110,47120,47130,47140,47150,47160
47010 IF TP(I)<212 AND VAL(RP$)>MAX THEN ERROR 6
47020 IF TP(I)<212 AND VAL(RP$)<0 THEN ERROR 123
47030 RETURN
47110 IF TP=-1 THEN ERROR 110
47115 RP$=FNS$(VAL(P1$)+VAL(P2$)):TP(I)=1:RETURN
47120 IF TP=-1 THEN ERROR 110
47125 RP$=FNS$(VAL(P1$)-VAL(P2$)):TP(I)=1:RETURN
47130 IF TP=-1 THEN ERROR 110
47135 RP$=FNS$(VAL(P1$)*VAL(P2$)):TP(I)=1:RETURN
47140 IF TP=-1 THEN ERROR 110
47145 RP$=FNS$(VAL(P1$)/VAL(P2$)):TP(I)=1:RETURN
47150 IF TP=-1 THEN ERROR 110
47155 RP$=FNS$(VAL(P1$)^VAL(P2$)):TP(I)=1:RETURN
47160 IF TP= 1 THEN ERROR 110
47165 RP$=P1$+P2$:TP(I)=-1:RETURN
47210 IF TP=-1 THEN ERROR 110
47215 RP$=FNS$(VAL(P1$) OR VAL(P2$)):TP(I)=1:RETURN
47220 IF TP=-1 THEN ERROR 110
47225 RP$=FNS$(VAL(P1$) AND VAL(P2$)):TP(I)=1:RETURN
47230 RP$=FNS$(P1$=P2$):TP(I)=1:RETURN
47240 RP$=FNS$(P1$<>P2$):TP(I)=1:RETURN
47250 IF TP=-1 THEN RP$=FNS$(P1$>P2$):TP(I)=1:RETURN
47255 RP$=FNS$(VAL(P1$)>VAL(P2$)):TP(I)=1:RETURN
47260 IF TP=-1 THEN RP$=FNS$(P1$<P2$):TP(I)=1:RETURN
47265 RP$=FNS$(VAL(P1$)<VAL(P2$)):TP(I)=1:RETURN
62990 '********** ROTINA DE ERRO
63000 E$=CHR$(ERR):IF INSTR(OK$,E$) AND CMP=1 THEN RESUME NEXT
63009 E=ERR:N.ERR=N.ERR+1:LN.ERR=1:PRINT TAB(5+IDT*3);"*** erro *** ";
63010 IF CMP=0 OR E<100 THEN PRINT "fatal *** ";
63020 GET #1,E:PRINT ER$
63440 IF CMP=1 AND E>99 THEN RESUME NEXT
64000 GET 1,129:ULT=VAL(ER$):IF E>0 THEN GET 1,E ELSE LSET ER$=""
64050 REGIST=(ULT MOD 128)+130
64055 ARQ$=MID$(ARQ$,2):P=INSTR(ARQ$,"."):IF P>0 THEN ARQ$=LEFT$(ARQ$,P-1)
64060 LSET ER$=FNS$(N.ERR)+" "+FNS$(N.CMD)+" "+ARQ$+" "+ER$
64070 PUT 1,REGIST
64080 LSET ER$=STR$(ULT+1):PUT 1,129
64090 PRINT:PRINT "Final de execucao"
64100 END

Splitting logs with PowerShell


I did some work to aggregate some logs from a group of servers for the whole month of February. This took a while, but I ended up with a nice CSV file that I was ready to load into Excel to create some Pivot Tables. See more on Pivot Tables at: Using PowerShell and Excel PivotTables to understand the files on your disk.

However, when I tried to load the CSV file into Excel, I got one of the messages I hate the most: “File not loaded completely”. That means that the file I was loading had more than one million rows, which means it cannot be loaded into a single spreadsheet. Bummer… Looking at the partially loaded file in Excel, I figured I had about 80% of everything in the one million rows that did load.

Now I had to split the log file into two files, but I wanted to do it in a way that made sense for my analysis. The first column in the CSV file was actually the date (although the data was not perfectly sorted by date). So it occurred to me that it was simple enough to write a PowerShell script to do the job, instead of trying to reprocess all that data again in two batches.

 

In the end, since it was all February data and the date was in the mm/dd/yyyy format, I could just split the line by “/” and get the second item. There’s a string Split() method for that. I also needed to convert that item to an integer, since a string comparison would not work (using the string type, “22” is less than “3”). I also had to add an encoding option to my Out-File cmdlet. This preserved the log’s original format, avoided doubling the size of the resulting file and kept Excel happy.

Here is what I used to split the log into two files (one with data up to 02/14/15 and the other with the rest of the month):

Type .\server.csv |
? { ([int] $_.Split("/")[1]) -lt 15 } |
Out-File .\server1.csv -Encoding utf8
Type .\server.csv |
? { ([int] $_.Split("/")[1]) -ge 15 } |
Out-File .\server2.csv -Encoding utf8

That worked well, but I lost the first line of the log with the column headers. It would be simple enough to edit the files with Notepad (which is surprisingly capable of handling very large log files), but at this point I was trying to find a way to do the whole thing using just PowerShell. The solution was to introduce a line counter variable to add to the filter:

$l=0; type .\server.csv |
? { ($l++ -eq 0) -or ( ([int] $_.Split("/")[1]) -lt 15 ) } |
Out-File .\server1.csv -Encoding utf8
$l=0; type .\server.csv |
? { ($l++ -eq 0) -or ( ([int] $_.Split("/")[1]) -ge 15 ) } |
Out-File .\server2.csv -Encoding utf8

PowerShell was actually quick to process the large CSV file and the resulting files worked fine with Excel. In case you’re wondering, you could easily adapt the filter to use full dates. You would split by the comma separator (instead of “/”) and you would use the datetime type instead of int. I imagine that the more complex data type would probably take a little longer, but I did not measure it. The filter would look like this:

$l=0; type .\server.csv |
? { ($l++ -eq 0) -or ([datetime] $_.Split(",")[0] -gt [datetime] "02/15/2016")  } |
Out-File .\server1.csv -Encoding utf8
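
Another option, which sidesteps the header problem entirely, is to let PowerShell parse the CSV with Import-Csv and filter on the parsed column. This is only a sketch: it assumes the date column in your header row is literally named Date (adjust to whatever your log calls it), and keep in mind that Export-Csv quotes every field, so the output will not be byte-for-byte identical to the original file.

# Sketch: split on a parsed date column; Import-Csv/Export-Csv carry the header row automatically
$rows = Import-Csv .\server.csv
$rows | ? { [datetime] $_.Date -lt [datetime] "02/15/2015" } |
Export-Csv .\server1.csv -NoTypeInformation -Encoding UTF8
$rows | ? { [datetime] $_.Date -ge [datetime] "02/15/2015" } |
Export-Csv .\server2.csv -NoTypeInformation -Encoding UTF8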

Now let me get back to my Pivot Tables…
