module downloads

You might have noticed that the new site is not particularly fast. This is because our server is nearly saturating our outbound connection to the Internet. This is what one of the two log analyzers I have running is telling me:

During July, we served up almost 75GB of vmod files. (That’s not the whole month, only from 16 July, the day the new site came online, onwards.) In the first five days of August, we’ve served another 17GB of vmod files. Module files constitute 72% (July) and 80% (August) of “viewed” traffic (“viewed” traffic excludes HTTP requests which result in redirects, 404s, etc.), but the traffic is well-spread over the modules: None of the top eight modules contributing the most traffic for August have been downloaded more than 45 times, for example.
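As a rough back-of-the-envelope check, the sketch below turns those figures into daily averages and a sustained bit rate. It assumes 16 days of service in July (16 through 31 July inclusive), decimal gigabytes, and "almost 75GB" rounded to exactly 75; the function names are only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def gb_per_day(total_gb, days):
    """Average volume served per day, in GB."""
    return total_gb / days


def average_kbit_per_s(total_gb, days):
    """Average sustained outbound rate in kbit/s (1 GB taken as 10**9 bytes)."""
    total_bits = total_gb * 1e9 * 8
    return total_bits / (days * SECONDS_PER_DAY) / 1e3


# Figures quoted above: ~75 GB over 16-31 July, 17 GB over 1-5 August.
for label, total_gb, days in [("July", 75, 16), ("August", 17, 5)]:
    print(f"{label}: {gb_per_day(total_gb, days):.2f} GB/day, "
          f"{average_kbit_per_s(total_gb, days):.0f} kbit/s on average")
```

These are averages over whole days for vmod files only; peak-hour rates are presumably well above them.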

We’re already sending out all files with a major MIME type of “text” gzipped; all of the HTML, CSS, and JavaScript together amounts to 1% of total traffic, so even if we optimized it away entirely it would make no appreciable difference. The next largest file type after vmod is vmdx. After that, we have PNGs, at 4.5%. I’ve already run optipng on all of the PNGs on the site, so I don’t see much more savings to be had there.

What I conclude from this is that we need more outgoing bandwidth or we need to divert at least some of the module downloads somewhere else.

Here are some possible solutions I’ve thought of:

  1. Get more outgoing bandwidth where we’re hosted.
  2. Co-locate the server at a place with more outgoing bandwidth.
  3. Set up mirrors for the modules.
  4. Put all the modules on Amazon S3.
  5. Use BitTorrent to distribute modules in addition to direct downloads.

#1 might solve our problem for the present with the minimum amount of disruption. It won’t be free. It might also be only a temporary solution—we might find ourselves saturating a larger connection, too.

#2 might solve our problem also. It definitely won’t be free, and we’d already be dangerously close to hitting the point where you pay for extra bandwidth.

#3 would be highly effective at reducing the traffic to our server, but would require two changes: we’d need a way of showing (and rotating) mirror links in the module library, and we’d need at least one reliable volunteer to run a mirror. The former is not a big challenge; it’s something I already know how to do, more or less. The latter is more difficult, not because it’s hard to set up a mirror (it’s really trivial, you can do it with rsync) but because it’s something of a commitment to do it. However, even a single mirror would help tremendously: if it took half of the module downloads, it would divert 40% of our total traffic. Location is another consideration: in order to be really useful for sharing the traffic load, I think we’d need a mirror in North America or Europe, because that’s where most of the requests originate. Sending half of our module download requests to Australia would, I think, just make for slow downloads (sorry Brent, Ben), though mirrors outside of North America and Europe could help increase download speeds in those places, so they might still be useful.
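To make the rotation idea and the 40% figure concrete, here is a minimal sketch. It assumes simple round-robin rotation, an even split of module downloads between the main server and the mirrors, and the roughly 80% module share reported for August. The host names, the module path, and the function names are all hypothetical.

```python
from itertools import cycle


def make_link_rotator(hosts):
    """Return a function that builds download links, cycling through hosts."""
    next_host = cycle(hosts)

    def link_for(module_path):
        return f"{next(next_host)}/{module_path.lstrip('/')}"

    return link_for


def diverted_share(module_share, n_mirrors):
    """Fraction of total traffic moved off the main server.

    Assumes module downloads are split evenly across the main server
    and n_mirrors mirrors.
    """
    return module_share * n_mirrors / (n_mirrors + 1)


# Hypothetical hosts: the main server plus a single volunteer mirror.
link_for = make_link_rotator(["https://main.example.org",
                              "https://mirror1.example.org"])
links = [link_for("modules/example.vmod") for _ in range(4)]
for link in links:
    print(link)

print(f"Diverted with one mirror: {diverted_share(0.80, 1):.0%}")
```

Under the same even-split assumption, a second mirror would move two thirds of the module traffic off the main server.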

#4 would be really simple and presumably provide good performance. Amazon S3 is a cloud storage service. I looked up the prices; we’d pay about $45/month for the amount of traffic we see right now.

#5, distributing modules by BitTorrent, is potentially quite spiffy, in that it might let us offload almost all of the downloads to someplace else and provide great speed, without any cost to us. We could run a tracker on the main server, and anybody who wanted to could grab everything and seed it; this would be kind of like running a self-organizing collection of mirrors. (E.g., I would probably seed from my backup server at home. I’d shut it off at night, but, unlike with mirrors, that wouldn’t cause any broken links.) What I’m not sure about with this one is whether many of our users would be comfortable using BitTorrent. (There is one client which is a Java applet—we could provide links which use that—but unfortunately it’s not open-source.) If we couldn’t get people to choose to download via BitTorrent, then we wouldn’t see much reduction in traffic.
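For anyone unfamiliar with what the server would have to publish, here is a minimal sketch of building the metainfo (the contents of a .torrent file) for a single module. It follows the single-file layout in the BitTorrent specification (BEP 3) as I understand it. The tracker URL, the file name, and the piece size are placeholders, and in practice we would more likely use an existing tool than hand-rolled code like this.

```python
import hashlib


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(name, data, announce_url, piece_length=256 * 1024):
    """Build the metainfo dictionary for a single-file torrent."""
    # "pieces" is the concatenation of the SHA-1 digest of each piece.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {"name": name, "length": len(data),
            "piece length": piece_length, "pieces": pieces}
    return {"announce": announce_url, "info": info}


def info_hash(metainfo):
    """SHA-1 of the bencoded info dictionary; identifies the torrent."""
    return hashlib.sha1(bencode(metainfo["info"])).hexdigest()


# Placeholder module contents and a hypothetical tracker URL.
meta = make_metainfo("example.vmod", b"x" * 600_000,
                     "http://tracker.example.org/announce")
torrent_bytes = bencode(meta)
print(len(torrent_bytes), info_hash(meta))
```

Because the info hash is computed over the piece hashes, any change to a module file yields a different torrent.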

In all cases, what the user sees for downloading modules would be nearly the way it is now, except that in the case of mirrors and BitTorrent, there would be a choice of links.

Thoughts?

I like #5 myself. But only because it seemingly requires less
responsibility.

  • M.

I’d vote for 4 or 5, they seem like good alternatives for fairly minimal
responsibility as Michael pointed out.

My only issue with 5 would be the size of the network. I’m guessing the
number of seeders would be rather small and may see intermittent seeds
dropping off from time to time. So unless a module is widely distributed, it
may go dark now and then. Personally my workstation at home stays on 24/7
for the most part and I would seed from there and could do limited seeding
from my office, but not too much due to bandwidth competition.

-----Original Message-----
From: messages-bounces@vassalengine.org
[mailto:messages-bounces@vassalengine.org] On Behalf Of Michael Kiefte
Sent: Thursday, August 05, 2010 8:26 AM
To: messages@vassalengine.org
Subject: Re: [messages] [Developers] module downloads

I like #5 myself. But only because it seemingly requires less
responsibility.

  • M.

I’d vote for 4, 3 in that order

4 is the sure thing - it just costs money

3 seems like best free solution?

Not a fan of 5. Have a compelling dislike for torrents - always find them slow
due to small number of seeders. Seeder dependent like Chuck says and we are
small group - not likely to improve things as a result


From: Chuck Parrott chuckparrott@earthlink.net
To: messages@vassalengine.org
Sent: Thu, August 5, 2010 9:35:04 AM
Subject: Re: [messages] [Developers] module downloads

I’d vote for 4 or 5, they seem like good alternatives for fairly minimal
responsibility as Michael pointed out.

My only issue with 5 would be the size of the network. I’m guessing the
number of seeders would be rather small and may see intermittent seeds
dropping off from time to time. So unless a module is widely distributed, it
may go dark now and then. Personally my workstation at home stays on 24/7
for the most part and I would seed from there and could do limited seeding
from my office, but not too much due to bandwidth competition.

Just my $.02… I’ve never been a fan of BitTorrent.

I’m a big believer in bit torrent, but I fear we wouldn’t have enough seeds to make it worthwhile. On any given day, are there many people downloading the same module?

Thus spake fil512:

I’m a big believer in bit torrent, but I fear we wouldn’t have enough
seeds to make it worthwhile. On any given day, are there many people
downloading the same module?

Depends on the module. Some modules get 30+ downloads a day.

If I understand how BitTorrent works, then I think having even one seed
other than the main server would be useful, as that would spread half of
the load to the other seed.


J.

Thus spake fil512:

I’m a big believer in bit torrent, but I fear we wouldn’t have enough
seeds to make it worthwhile. On any given day, are there many people
downloading the same module?

BTW, aren’t you supposed to be in Iceland? :)


J.

If all module files are hosted on bittorrent by the VASSAL server, then it could work. Every other user that seeds after download will add to the available bandwidth for others. If module uploaders/authors are required to seed their modules themselves, then I don’t see it succeeding.

With Comcast as my ISP, the BitTorrent option would probably cut me out of downloading VASSAL modules.

Keith

Just my 2 cents worth. Out of all the times I’ve tried to download things using BitTorrent, I’ve only had one successful download. I gave up using it.

BitTorrent sucks.

Thus spake bdgza:

If all module files are hosted on bittorrent by the VASSAL server, then
it could work. Every other user that seeds after download will add to
the available bandwidth for others. If module uploaders/authors are
required to seed their modules themselves, then I don’t see it
succeeding.

The idea was the former—that vassalengine.org would host the tracker,
as well as providing the first seed.


J.

The idea was the former—that vassalengine.org would host the tracker,
as well as providing the first seed.

In my experience, the main reason for BitTorrent working ‘badly’ is that you are trying to download something obscure with no reliable seeds available and few other clients on the network. I think BitTorrent could work very well in a community like Vassal where the main server would always be available as a primary seed for all modules.

However, the problem with BitTorrent is that it runs as a separate client. People start it up to download the files they want, it may then seed for a while, but then the computer restarts and the client is not restarted. We would need a conscious effort by people to leave their BT clients running. I would be happy to allocate a portion of my upload bandwidth to a permanently running BT client, as I am sure quite a few of us would. To work, though, the primary seeders would need to download every new module as it came along: I would have to, so that I am available as a seed, and so would everyone else (i.e. we would become mirrors).

What would be Really Neat would be to build a BT client into the Vassal Module Manager so that you could navigate the Module repository and click on the modules you want to download, and down they would come. Pretty much the way the Steam Client works. There are GPL Java BT libraries about, but not LGPL I think.

B.

Just my $.02… I’ve never been a fan of Steam.

Just my $.02… I’ve never been a fan of Steam.

Don’t think you’re getting off that easy!

Why?

What are the shortcomings?

What would make it better?

Thus spake “Brent Easton”:

What would be Really Neat would be to build a BT client into the Vassal Module
Manager so that you could navigate the Module repository and click on the
modules you want to download, and down they would come. Pretty much the way
the Steam Client works. There are GPL Java BT libraries about, but not LGPL I
think.

I had that very same thought last night. This would be one more reason to
go GPL.


J.

If there is a VASSAL tracker and always the 1 seed available, it is pretty much as it is now, with perhaps more strain from running more software. But with people helping out seeding it can quickly become beneficial, especially for popular modules.

Problem: One thing that is mentioned by swampwallaby is something that you often see with torrents, that every new version is a new share. Very quickly you will have several seeds seeding v1, then some seeding v2, lots seeding v3, and 1 seeding v4, etc.

Is it not possible to use the compiled (GPL) libraries in LGPL software if you don’t modify the source code of the libraries?

Including the download client inside the VASSAL software itself would be wonderful to have if it could be realistically done and didn’t create negative side effects (crashing my game while it is busy doing BT stuff, for example). It would also need many of the controls that BT clients have if it is going to do uploading (I wouldn’t want it to use 100% of my upload all the time on all modules; then I would have to block VASSAL with my firewall, which isn’t convenient). It should remain possible to install/download/share modules yourself, rather than creating a locked-in system where you HAVE to use the built-in BT system. If BT is the path chosen, then perhaps using normal separate BT clients first would be a good idea before all the effort is taken to make a built-in engine.

Thus spake bdgza:

Is it not possible to use the compiled (GPL) libraries in LGPL software
if you don’t modify the source code of the libraries?

If you link to a GPLed library, then you need to be able to offer the
whole program under the GPL if you distribute it.


J.

Can anybody recommend a good BitTorrent tracker to set up? I’d like to do some testing, but can’t even get started without setting up a tracker first, and I don’t see an obvious candidate out there. (Fedora seems to have no trackers packaged, other than the obsolete one bundled with the bittorrent package, at least not that I can find.) I’ve looked here,

en.wikipedia.org/wiki/BitTorrent … r_software

but don’t know anything about any of these.