
Ugh. So, instead of tackling something like auto-queuing of copy operations to prevent disk thrashing, they did the 'hard' work of adding a pause button and some silly bling. Incidentally, if that conflict-resolution dialog doesn't provide mouse-over image enlargement/preview, I'm going to instantly hate it.


Yes, from their example ("pausing" one task speeds up the rest, and graphs show speed progress "hills and valleys") it can be concluded that they let copy tasks run in parallel instead of queuing the files. That's practically a guaranteed way to produce more fragmented disk layouts. Well done, current MSFT programmers.

You understand what's going on in MSFT when the blog post ends with a line like "All of this adds up to building a significantly improved copy experience, one that is unified, concise, and clear, and which puts you in control of your experience." A clear example of managerspeak -- tuned to sound good among other managers, not to mean something. You can even see where the battles for "bling" come from.


Disk fragmentation has not mattered for the last few years. Most modern hard drives do not store data in the exact way that they report to the software layer. Hardware manufacturers use all sorts of tricks to make their drives faster, and many of these 'break' the original specifications for how these devices are meant to work.


Still, if you write a piece of A, a piece of B, a piece of C, then another piece of A, then another piece of B (A B C A B C), you certainly won't end up with AAA BBB CCC on the disk, whereas you can expect most files to be laid out as AAA BBB CCC if you write from the queue.

In the case of A B C A B C you need a disk seek when reading the pieces of a file; in the case of AAA you don't. As long as there are mechanical hard disks (and they still have big advantages for a lot of uses), such things matter a lot. You can seek only about 100 times per second! For comparison, in the time spent waiting on one seek (about 10 ms), a 2 GHz machine can do at least 20 million calculations...
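
A back-of-the-envelope check of those numbers, as a rough sketch. The assumptions are 100 seeks per second, a 2 GHz clock, and (simplifying a lot) one simple operation per clock cycle:

```python
# Rough arithmetic: how many CPU cycles pass during one disk seek.
# Assumptions (illustrative only): 100 seeks per second, 2 GHz clock,
# one simple operation per cycle.

SEEKS_PER_SECOND = 100
CLOCK_HZ = 2_000_000_000

seek_time_s = 1 / SEEKS_PER_SECOND               # 0.01 s, i.e. 10 ms per seek
cycles_per_seek = CLOCK_HZ // SEEKS_PER_SECOND   # cycles elapsed during one seek

print(f"seek time: {seek_time_s * 1000:.0f} ms")
print(f"cycles per seek: {cycles_per_seek:,}")   # 20,000,000
```

A real CPU can retire more than one instruction per cycle, so "at least" 20 million is the conservative reading.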


Mechanical disks still exhibit different performance from the beginning to the end of the disk; they are still mostly linear. SSDs are of course completely different.


Out of curiosity: what operating system does automatically queue separately initiated copy operations? Wouldn't it be quite confusing if some copy operation just wouldn't start at all before some other - possibly very time consuming - operation was concluded?


As another poster mentioned, Mac OS does this. It solves the problem you mentioned by combining the file copy progress bars under a single window. Visually, it looks like a queue, with the top progressing fastest, and the ones below progressing very slowly or not at all.


Linux. It doesn't queue individual files, but it does schedule disk blocks to minimize seeks and maximize throughput. Look up "elevator algorithm" and "Linux IO scheduler" for more detail.
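
To give a feel for the idea, here is a toy sketch of an elevator-style sweep. This is not the Linux scheduler's code; the function names and the request list are made up for illustration, and real schedulers also handle deadlines, merging and fairness:

```python
def elevator_order(pending, head, direction=1):
    """Order block requests as a simple elevator sweep would service them:
    keep moving in `direction` from `head`, then reverse for the rest."""
    if direction > 0:
        ahead = sorted(b for b in pending if b >= head)
        behind = sorted(b for b in pending if b < head)
        return ahead + behind[::-1]
    ahead = sorted(b for b in pending if b <= head)
    behind = sorted(b for b in pending if b > head)
    return ahead[::-1] + behind


def head_travel(order, head):
    """Total distance the head moves when servicing `order` from `head`."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


requests = [98, 183, 37, 122, 14, 124, 65, 67]  # arrival order (made-up numbers)
start = 53

fifo = head_travel(requests, start)
scan = head_travel(elevator_order(requests, start), start)
print("FIFO travel:", fifo)      # 640
print("elevator travel:", scan)  # 299
```

Servicing the same requests in sweep order cuts head travel by more than half in this example, which is the whole point of reordering at the block level.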


Windows does this too, of course. Any modern OS does.

http://download.microsoft.com/download/a/f/7/af7777e5-7dcd-4...


I'd be surprised if it didn't. Then again, I was rather surprised that Windows XP's throughput dropped massively when I started copying more than one file at once, so it seems that it didn't do a good job of it.


sciurus did say modern, which XP most definitely is not.


XP is pretty modern in relation to elevator seek, which has been in use since the 70's, at least.


This is on such a different level as to be irrelevant. The I/O scheduler will still try to accommodate requests within timeframes of seconds, at most. When copying large files to/from two different areas on a spinning disk simultaneously, this means significant time will be spent flying the heads back and forth between them, regardless of the elevator. So queueing two 8GB movie files to go one after another will be significantly faster than copying them simultaneously.
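
A toy model of that claim, with every number an assumption: 100 MB/s sequential throughput, 10 ms per seek, and the heads switching between the two files after every 1 MB. Real behaviour depends on caching, the scheduler and the drive, so treat this as a sketch of the reasoning rather than a measurement:

```python
def copy_time_s(total_bytes, throughput_bps, seeks, seek_time_s=0.010):
    """Transfer time plus time spent repositioning the heads."""
    return total_bytes / throughput_bps + seeks * seek_time_s


FILE_BYTES = 8 * 1024**3      # one 8 GB file
THROUGHPUT = 100 * 1024**2    # 100 MB/s sequential (assumed)
BATCH = 1 * 1024**2           # data moved per file before switching (assumed)

# Queued: one file after the other, so only a couple of long seeks overall.
queued = copy_time_s(2 * FILE_BYTES, THROUGHPUT, seeks=2)

# Simultaneous: the heads fly to the other file after every batch.
switches = 2 * FILE_BYTES // BATCH
simultaneous = copy_time_s(2 * FILE_BYTES, THROUGHPUT, seeks=switches)

print(f"queued:       {queued:.0f} s")
print(f"simultaneous: {simultaneous:.0f} s")
```

With these made-up parameters the seek time is as large as the transfer time itself; a bigger batch size shrinks the gap, a smaller one widens it.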


I'm 90% sure that one of the changes made to the way Lion copies files was this exactly - it now queues operations for maximum throughput. I do wish I could find a link that said this - I apologize for that.


Everyone is assuming that they are doing no improvements in the back-end. This was a discussion about the front-end of copy operations. I like the new interface and think that while it does not seem whoopingly big, it will make copy-pasta more tasty.

Just when I thought I could hate on Microsoft, they come out with a nice improvement.

HOWEVER, what I'd really like to see is a tool which shows you at boot time which boot ops are not behaving well, and gives you a very easy and responsive interface for murdering those ops. If it's a video driver, fall back to the default: some crap that can show me a web browser to troubleshoot with, plus giant warnings that your video driver is dead. If they can fix this whole mess of startup taking god knows how long because of one bad application, Windows will be quite awesome.

And then they need to help developers get the POSIX tools ported to Windows in the most meaningful way, out of the box, with no special install. That includes replacing CMD with Bash and changing their FS to support a good structure like Linux's (C: and D: can still exist, but add a Sys: which contains things like proc and friends).


> That's practically a guaranteed way to produce more fragmented disk layouts

Not necessarily. You can preallocate space at the destination when the copy starts. File copies are one of those cases when you know up front what the file size will be.

IIRC, there have been ways to do this in most file systems since the early 90's. I remember HPFS in OS/2 could do that.
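
A minimal sketch of the idea in Python. The helper name `copy_with_preallocation` is made up for this example. `os.posix_fallocate` only exists on some Unix platforms, so the code falls back to `truncate`, which sets the length but may not actually reserve blocks. Whether preallocating really gives you a contiguous file is up to the file system:

```python
import os
import shutil


def copy_with_preallocation(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst, asking for the destination's full size up front."""
    size = os.path.getsize(src)  # known before the copy starts
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        if size > 0:
            try:
                # Ask the file system to reserve `size` bytes now.
                os.posix_fallocate(fout.fileno(), 0, size)
            except (AttributeError, OSError):
                # Not available or not supported here: just set the length.
                fout.truncate(size)
        # The write position is still 0, so this fills the reserved space.
        shutil.copyfileobj(fin, fout, chunk_size)
    return size
```

The point is only that the size is known before the first byte is written, so the allocator does not have to guess.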


It's more complicated than that. Specifically, on NTFS there's that thing called the MFT, which stores the file information, and even the whole file if it's small. You can even get the MFT fragmented if you write files from folders A, B, C, D in the order A1 B1 C1 D1 instead of A1 A2 A3 ... If you copy, for example, 3 folders with a lot of files (no matter the file sizes) without the queue, you make the disk read three times more data, and skip over most of it, when you do a simple dir on one folder (before it's cached in the in-memory file cache, of course). Queues are an important thing, and ignoring them is still bad. Even on an SSD -- you reduce the throughput there too by not caring.


I get queues are important, but the fragmentation problem should be a non-issue unless the code is criminally naïve.


Programmers should try to solve problems and not try to achieve lofty goals set by management like "improve the copy experience".

When you try to achieve goals set by management you compile all the problems together into one package, nicely designed or not, without actually solving any of the root concerns.

How you really bring results as a programmer is when you divide the issues into the smallest reasonable parts and then find the best way forward for each one of them.

And then later you bring all of the separate lines of work together into one whole symphony of code and experience, so to speak.


> That's practically a guaranteed way to produce more fragmented disk layouts. Well done, current MSFT programmers.

Is disk fragmentation a serious problem for you? I use SSDs for most of my drives, and let the Microsoft defragger do its thing once in a while, and leave it at that. Doesn't seem to bother me none.


You're not supposed to defrag an SSD with the Microsoft defragger. Are you really doing that?


Nope, sorry -- I wrote too quickly. What I meant to say was that I use SSDs for most of my drives, and run the Microsoft defragger periodically on the remainder.


In the case of Linux (more specifically ext4), you get some benefit from having the blocks of a given file contiguous, as they will end up being one single extent.


Do you have a reference for this info?


http://www.micro-isv.asia/2010/12/never-defragment-an-ssd/

"The key benefit to SSDs is that they have virtually no seek time. Reading adjacent blocks of data is no faster than reading blocks that are spread out over the drive. Fragmentation does not affect SSD drive speed.

(...) SSD drives physically wear out as you write to them. Defragmentation software moves around all the files on your drive. Thus, defragmenting an SSD reduces its life span without giving you any benefits."


I had an old netbook with an SSD. A lot of folk noticed that the performance of the disk got slower over time. Surprisingly, defragging the drive actually improved the performance -- but IIRC it was a side effect of the process, not directly anything to do with fragmentation.

I don't quite remember what the issue was, but there was a good anandtech article about it: http://www.anandtech.com/show/2738/8

Fairly certain modern SSDs are not affected, in any case.


Ahh, do you know why? TRIM support isn't being used.

When you're defragmenting, all you're doing is forcing the drive to write a lot, which is essentially a crazy way to fix it without the TRIM command, but it works. On the other hand, you're doing a lot more write cycles than TRIM or garbage collection would normally do.


Article says: (Windows 7) "the OS knows it’s a SSD and turns off features like defrag"

So you claim to see improvement from something turned off by the OS? How?


Err, this was several years ago, and I was relaying what other folk with the same device (ASUS 901) reported, not my personal experience.

(And not only was this before the release of Windows 7, it actually came with some version of Xandros installed...)


SSDs don't benefit from data being sequentially aligned. Random access is just as fast. Also, you should minimize the write access to the disk whenever possible.


People who are serious about this stuff just use TeraCopy on Windows anyway. Who wants to copy terabytes or even gigabytes of data and not verify it?

People who build systems, in general and excluding the ZFS authors, really need to start worrying about my data's integrity at least as much as I do.
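
For what it's worth, the basic copy-then-verify idea is only a few lines. This is a sketch, not a description of how TeraCopy does it internally; the function names are made up, and SHA-256 is just one reasonable choice of checksum:

```python
import hashlib
import shutil


def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then re-read both and compare checksums."""
    shutil.copyfile(src, dst)
    if file_digest(src) != file_digest(dst):
        raise IOError(f"verification failed for {dst}")
    return dst
```

One caveat: reading the destination back right after writing may be served from the OS cache, so a match does not strictly prove the bits made it to the platters.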


I used to use TeraCopy, but on my Windows 7 64 bit machine, it's half the speed of the built-in copy for network transfers. It's fairly clear that Windows is doing a significantly better job with buffering.

I still keep TC around for the odd job, but it's no longer my default.


Ouch. That's good to know.

Through sheer luck, until now I've mostly done network transfers using smb + rsync on my notebook.


Auto-queueing is not a no-brainer. While it is easy to assess which operations will contend with each other on local drives, as soon as you start interacting with the network it gets far more fluffy, as the storage arrangement is abstracted away from where the local OS can detect it.

Even for local operations, automatically queueing operations could be sub-optimal. If you are copying chunks of small files from two spinning-disk-and-moving-heads drives to an SSD, a basic queuing algorithm would perform one copy after the other, but the SSD is probably more than capable of keeping up with both at the same time. So you are going to need a UI control to override the default queuing.
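
To make the HDD-to-SSD case concrete, here is a hypothetical sketch of the contention check a smarter queue might run. `CopyJob`, `contends` and the device names are all invented for the example, and treating unknown devices (such as network shares) as seek-bound is just a cautious guess:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyJob:
    src_device: str
    dst_device: str


def contends(a, b, rotational):
    """True if two jobs share a device that is (or may be) seek-bound.

    `rotational` maps device name -> bool; devices missing from the map
    (e.g. network shares) are conservatively treated as seek-bound.
    """
    shared = {a.src_device, a.dst_device} & {b.src_device, b.dst_device}
    return any(rotational.get(dev, True) for dev in shared)


rotational = {"hdd1": True, "hdd2": True, "ssd": False}

job_a = CopyJob("hdd1", "ssd")
job_b = CopyJob("hdd2", "ssd")
job_c = CopyJob("hdd1", "hdd2")

print(contends(job_a, job_b, rotational))  # False: they share only the SSD
print(contends(job_a, job_c, rotational))  # True: both touch hdd1
```

Jobs that do not contend could run in parallel and the rest could be queued, but since the guess for unknown devices can be wrong either way, the manual override is still needed.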

BTW: If you need auto-queuing and other tweaks in this area for current Windows variants, I've been using the free (not Free) version of TeraCopy (http://en.wikipedia.org/wiki/Teracopy) for some time and have found it to be useful and reliable.

One thing I would love to see in a conflict resolution dialog is the option to view a diff of text files and similar, like Debian offers to help resolve file conflicts when applying package updates, not just previews of graphics files.



