Thursday, December 10. 2009
The Malware Problem (and a solution)
Some of you might have heard about the Malware incident that recently hit our friends from gnome-look.org. While some of you might chuckle about it because it hit the competition, there really is nothing to chuckle about, because the next target could easily be us. In fact someone might be uploading a Malware component at this very moment. No one could tell until it's too late.
So, there have been some discussions about possible solutions for this issue. Some have proposed that we add a review process to all of this, so that anything that gets uploaded gets a security check from some KDE developers. That's a neat idea on paper. But only there. This couldn't possibly work out, for two reasons:

1) Manpower - We simply don't have enough of that.
2) Responsibility - Who wants to be responsible for letting Malware slip through your fingers? This can happen to anyone, and it would be pretty embarrassing. I certainly wouldn't want to be responsible for anything.

Back when we designed the scripting system for Amarok 2 (QtScript, in-process), Ian Monroe and I realized that there isn't really any way to make it secure on a technical level. Sandboxing, automatic malware detection, flying cars - all this works somehow in theory, but in reality it requires some Bruce Schneier to do it, which we don't have (there is only one Schneier, I guess). So basically we realized that the system would be unsafe, and that we would have to live with it.

Amarok is very vulnerable to Malware scripts, because scripts can access most of Amarok, and Qt, and whatnot. Any Joe Schmoe could hack up a two-liner script that deletes your $HOME. So we accepted that reality, and tried to think of some other methods for making it all safer. What we came up with is this:

Mandatory Version Control

Basically our idea was that all Amarok scripts (and the same could help with other download components) would have to be hosted in a public version control system (VCS). This system could for instance be SVN, simply because it's relatively easy to learn, and we could use a central server for it.

These are the three advantages that we'd expect from the proposed system:

As an addendum, I should say that this system could only work if we enforce it, making it mandatory for all scripts and any kind of program code that is reachable via GHNS.
Making this system optional wouldn't solve anything, because then a Malware person could still merrily go ahead with his/her evil intentions.

To sum it up, I think that this approach could really help us, and all that's missing is a practical implementation. We would have to work together with the kde-apps.org people (mainly Frank Karlitschek), and the GHNS developers, and then set up an official VCS repository (maybe KDE SVN, maybe something else).

I'd be interested in hearing your opinions about this proposal, so please leave a comment if you have an opinion on it.
Comments
There is always a solution: establishing trust at the bottom of the software stack. But this is something that Amarok developers cannot do; it is up to distributors.
The answer is the security-enhanced version of the Linux OS called SELinux. With correct configs, no untrusted code would be executed on the system. But it is such crazy protection that it makes maintaining the system even harder. So we have other systems to include for that, like AppArmor (crazy as well) and PolicyKit. With these we can build up a secure system, but again, only if the user is willing to obey secure protocols. So in the end we will end up in the situation where the user does everything to get their hands on a nice screensaver and compromises the whole software system by that stupid action. The only way to avoid that is to teach users how the computer works, how the operating system protects all the other processes, how all the software on the system is related, and how the base security is built on open source and trusted upstreams. That is an impossible task for people who just want to see their videos, hear music and use Facebook.
This is a really good idea, that can be extended to contents of other GHNS-enabled programs. And if you use KDE SVN, you will be able to adapt scripty and get translation infrastructure (almost) for free
That's actually a very good argument that I hadn't thought of initially:
This approach could indeed help solve the current issues with translating scripts. At this point there is no standard for it; people are using different home-brew solutions all over the place. Great idea, thanks
This is a VERY GOOD idea, solves a lot of the issues we have in finding GHNS providers, configuring them, setting up access, etc. And it solves the translation link as well, which is crucial.
I cannot find any drawbacks to this proposal, really. Why did no one think about it before?
It would be cool if we had something similar to Plasmate but for all the scripted extensions KDE can handle.
It could make it easier to handle commits to the repository and to create any kind of script in general (there could be some templates for different kinds of scripts and the user could build upon that). Btw, why SVN and not Git? (I don't really care, I just ask because Git seems to have a lot of friends these days.)
If someone commits to a central repository, you know the name of the user account that committed. This does not necessarily mean you know which real person is behind that user account.
You could set up a validation process, where people have to show some kind of ID to a KDE developer before they get an account, but that would be a big hurdle for new contributors. You could use an existing system like CAcert instead, but still it would not be easy for new contributors to join.

Alternatively, you could let a new contributor provide some patches first and, if those seem OK, give him/her a user account. You still run the risk that someone first provides good code and injects malware later, but most malware authors will likely go after easier targets instead. This is in effect a review system though, but only for new contributors. This means that there is still someone responsible for making the decision to trust the new contributor. It also means that if you have too little review capacity, you will have to limit the number of new contributors you can accept.

I do think that mandatory version control is a good idea, but it does not allow you to both prevent malware and still accept a lot of user contributions.

In my opinion you are a bit quick to dismiss sandboxing. An Amarok script has no need to access my home directory, so why is it allowed to do so? Why not launch it in a chrooted process, for example? If Chrome can do this for browser tabs, surely it is possible to contain Amarok scripts in the same way. Not easy, perhaps, but possible.
Hmm, why not something like gitorious? This could certainly help build a web of trust, where authors can easily propose merging their scripts into the main account and could get some nice peer review before it's accepted. And the number of contributors to the main repo is AFAIK limitless, so if someone made some contributions he could get access to the master repo...
Personally I love Git, so I'm all for it. But let's be honest here, the learning curve is a bit steeper than with SVN.
Thomas Zander is currently working on an alternative interface for Git (called "VNG") that could help with it. We'll see
Well, if you don't use branches then it's pretty easy for a newcomer IMHO, just the additional git push is required (sometimes it's hard to explain the concept of VCS to new users :/)
Anyway, I get your point because I'm not a typical user and my opinions may be terribly skewed...
I agree that we can't expect everyone to be power users of git, but you don't have to be a power user to use it to the same extent as svn. The core development workflow is very similar between git and svn, and if a developer cannot be expected to understand the push, fetch, and rebase commands, then it isn't going to be any easier for them to resolve conflicts in svn.
The myth that git is "hard" compared to svn is self-perpetuating, because comments to that effect discourage developers from trying it out and discovering that it's actually not that complicated. Instead, we are encouraging new developers to learn an obsolete workflow, and I think that is a bigger problem than the unfamiliar tool problem. What if we were all still using CVS today because of the same attitudes about Subversion?
"Why not git?" Because that would raise the bar way too much. Read the part where it says "That's because learning something like SVN is pretty simple"; git ain't such
runs away from flamewar
I was really convinced of this, for a long time, because Git installed so many damn binaries and there was so little good documentation available on the Internet about Git a few years ago.
You know how many of them I really use? Maybe 10. And they're no harder than they were in Subversion.
What about Mercurial (hg)? It is a DVCS, but its command set is similar to SVN.
SVN is better if you want partial checkouts and a directory based system. Which we probably would for this. ACL is easier for SVN as well (we'd want to limit who can tag their project likely).
Granted Mark's point isn't really about which VCS to use. Any would do.
What about: making Amarok so good that people don't even want to write plugins, because the default works really really well for them?
You still have stuff that can't be provided "built in" in Amarok for various reasons (like lyricwiki which tends to update too often, or shoutcast with the legal mess).
Because a plugin architecture allows me to only have to deal with the functionality I want.
Why should I be faced with a billion options for various web services, remote music storage and visualization when all I want is a spreadsheet that manages my local music collection and Smart Playlists to organise the playlists dynamically through some quick labelling (tagging)? In fact, if the (presentation portion of the) playlist itself were pluggable, those who for some strange reason like the new design could keep it, while those who just want to manage their music easily on a nice spreadsheet, column headers and all can also do so.
I will assume we are talking about binaries that come with source code (though source can be dubious, I know.. like installing something else from the web after plugin installation).
I think the idea of signing off is a great idea (not an impractical one) if you follow some guidelines. Allow the signer to specify how well they looked at the plugin. This relieves pressure on the auditor but provides something better than nothing. Eg, I looked over the patch very quickly and didn't understand all of it. I give myself 2 out of 5 stars for "audit effort". No warranties implied.... The signoff can have a comments section for any peculiarities the auditor finds. This can be used to state that you didn't look closely at some section or to explain how some particularly tricky section works. Such comments help simplify the work of auditors that come after you. In fact, depending on the mood of the auditor, s/he might prefer to tackle a virgin piece of code to make headway or else jump in at something that has been predigested. [As an aside: The comments section also naturally leads to documented code as concerns safety; however, this peer review process can be extended to allow all sorts of comments, including comments aimed at explaining to a noob of the language (or of the project) how the code works. Someone wanting to write up a more formal document can leverage these notes and save time (or not need coding skills). Thus, leaving comments behind can be encouraged and can be at different levels (eg terse or eg directed at nondevs). Really, we want comments attached to code without cluttering the code (at least not cluttered in the typical view). Tying this together can be done in a number of ways (line references, etc, that can be unified in an editor automatically). Also, there would be no need to pick the best comments. All comments would be accessible and some of these would be apt for the particular audience while the others would not. The reader picks (comments would have unique id), perhaps leveraging someone else's choices of which comments are useful for an audience X.] 
In addition to comments and to a check-off level of safety (eg, 1-5 stars), the system can also have tagging with various "warning" tags to indicate the software might be trouble (eg, depends on external sources) or has various properties. Such a signoff system should be open to all for various reasons. Those desiring to become contributors with extra access can gain a reputation by signing things off. Also, the idea is to allow unlimited auditors/signers per piece of code. Surely, if you care about a plugin and know how to code, you may volunteer to become say the 29th signer. This way the more popular plugins get many signers, surely some of whom will use a fine comb brush. *There is safety in numbers (in terms of liability/reputation of each auditor) and in catching possible problems*. Finally, such a system will attract developers to the project (amarok, in this case I suppose). In short, a signoff system with certain features can be used to strengthen the developer community, with low negative risks to the auditor's reputation, while providing help to users. [Disclaimer, this is the first time to this blog. I assume we are talking about auditing plugins or code additions of some sort (to amarok, in this case).]
>> I will assume we are talking about binaries that come with source code (though source can be dubious, I know.. like installing something else from the web after plugin installation).
Woops, I forgot to remove this. I'm not sure what the precise problem is, but I think it's about source code auditing, right? >> In addition to comments and to a check-off level of safety (eg, 1-5 stars), the system can also have tagging with various "warning" tags Woops, let me clarify this. [It's late at night,sorry.] The auditor can state how well they looked at the code: 1-5. This lets others know how much faith to put into the result by this auditor. This is the way for the auditor to wash his/her hands of responsibilities. "I was [sleep walking/ in a meeting/ etc] when I looked the code over." The auditor can leave comments. The auditor can add tags. The auditor can give a passing/failing (binary) grade if what we want is "safe" or "not safe". Of course, other gradings can be given. The totality of these items define any given audit would help guide how much trust should be placed on the software. [You may also want a ranking by developers of auditors. This ranking can be kept simple or have various important points noted; *however*, this is unnecessary in most cases and can even be compiled by some interested third party or other (to prevent wars and problems). In fact, perhaps there should be no ranking, although you definitely might want to provide a page for each author pointing to their past audits.] >> There is safety in numbers (in terms of liability/reputation of each auditor) and in catching possible problems Just wanted to highlight this, as it is an important point. It's the second way the auditor can wash his/her hands. For example, if s/he is the 4th auditor, it's fairly safe to agree with the other 3. Alternatively, s/he has a chance to stand out a bit by finding something the others missed (who, naturally, were taking a shower while they made the audit: 2 out of 5 stars).
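The sign-off record described in this thread (self-rated effort from 1 to 5, an optional pass/fail verdict, free-form tags and a comment) can be sketched as a small data structure. This is only an illustration in Python: the names `Audit` and `summarize` are made up for this sketch, and weighting each verdict by the auditor's self-rated effort is an assumption, not something the comment prescribes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Audit:
    """One sign-off on one plugin version (hypothetical structure)."""
    auditor: str
    effort: int                      # self-rated audit effort, 1 to 5 stars
    passed: Optional[bool] = None    # None means no safe/unsafe verdict given
    tags: List[str] = field(default_factory=list)
    comment: str = ""

    def __post_init__(self):
        if not 1 <= self.effort <= 5:
            raise ValueError("effort must be between 1 and 5")


def summarize(audits):
    """Combine any number of audits into one summary for a plugin version.

    Assumption: verdicts are weighted by effort, so a careful 5-star
    audit counts for more than a quick 2-star glance.
    """
    verdicts = [a for a in audits if a.passed is not None]
    total_weight = sum(a.effort for a in verdicts)
    pass_weight = sum(a.effort for a in verdicts if a.passed)
    return {
        "audits": len(audits),
        "weighted_pass_ratio": pass_weight / total_weight if total_weight else None,
        "tags": sorted({tag for a in audits for tag in a.tags}),
    }
```

Because every auditor only adds a record and never overwrites anyone else's, the 29th signer costs nothing extra, which is the "safety in numbers" point made above.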
Would a modified bugzilla (or other issue tracking software) work?
For example, you could automate to issue an "audit bug" for every plugin version uploaded. Audit comments can be done by anyone. Those wishing to further add a score, a pass/fail, a specific comment, an indication they were only using 3 out of every 5 brain cells, etc, can go ahead and include in their comment the code established for such a score, etc.
A cron script (or better: a triggered script) then goes through every new audit bug comment since the last check. It grabs the essential items from the comment (author, score, comment, etc, if any) and updates a website. The website has a listing of all plugins that are on the website with links to a profile page listing the audit cumulative results and anything else that is associated with a particular plugin+version, on a per plugin version basis. There can be numerous cross-references pulled from this data, of course (eg, to get a listing of the "safe" plugins, of those with at least 5 (partial) audits, of those audited by a specific auditor/developer, etc). In essence, new official audit (or other purpose) codes can be added whenever, and any background script can harvest this data from the comments on the "audit bug" (though not strictly of "audit" nature, perhaps) for the particular plugin+version. Eg, you can add a documentation tag or score related to the plugin or add a tag related to the style of coding within the plugin source. You can add tags, scores, etc of all sorts. You can have a code for providing a link to a video/tutorial about the plugin. You can even add a code for a link to 3rd party patches/diffs that do (or not) have an official entry (of course, these patches should have an official entry if they are popular, so that they can be "audited" fully themselves). The open source world is ultimately going to standardize on much metadata, as this will improve automation and integration across projects in ways custom to each user's preferences. It will also facilitate a world where fork/diff (git style perhaps) is the norm. One obvious requirement will be a unique naming scheme for every version of any piece of software or mix of software. Eg, a piece of code can have multiple "names" (not sure which would be the canonical one) essentially comprising an identifier for each of a series of patches that define the software. 
Each patch would be identified based on the contributor or origin and using something like a UUID (or domain name scheme, etc). As this "world" develops, the real value of source code will become clear (think of gentoo-like distros but more advanced and controlled by goals tied to metadata commands on a per user basis) with most installations of Linux coming with all the source (hard drive space will be very cheap.. so as to have all of the source and many pre-compilations in the "cache"). It will be normal to "contribute" in some way to some part of the software that you run. There will be tools to make all sorts of source/doc/audit/tutorial/etc contributions easy to do (ample documentation and tutorials and automation will exist for these functions) and to publish the contributions easily. ["source" can refer to artwork patch, midi score patch, etc]. And surely security will be very important with all of that source code sharing. Users can set levels of trust and the system will sort everything out based in part on signed "audits". Think of code components being passed around but not as binaries but a metadata files pointing to the patches, etc, that define the eventual binaries. [I think http://nepomuk.kde.org/ may grow to help fill these tagging shoes.] There are many more details.... [Above parent comment #9 is in response to #8.1. This comment in reply to parent #9 should be on that #8 thread as well.]
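The harvesting step (a script that pulls audit codes out of free-form comments on an "audit bug") can be sketched independently of any particular tracker. The marker syntax `[audit effort=2 result=pass tags=a,b]` below is invented for this example; the comment does not specify a format, and nothing here talks to Bugzilla or any other real tracker.

```python
import re

# Hypothetical marker format: [audit key=value key=value ...]
AUDIT_RE = re.compile(r"\[audit\s+([^\]]*)\]")


def parse_audit_codes(comment_text):
    """Return one dict per [audit ...] marker found in a comment body."""
    results = []
    for match in AUDIT_RE.finditer(comment_text):
        fields = {}
        for token in match.group(1).split():
            key, sep, value = token.partition("=")
            if sep:                      # ignore tokens without '='
                fields[key] = value
        if "tags" in fields:
            fields["tags"] = [t for t in fields["tags"].split(",") if t]
        if fields.get("effort", "").isdigit():
            fields["effort"] = int(fields["effort"])
        results.append(fields)
    return results
```

A cron job or trigger would call `parse_audit_codes` on each new comment and push the resulting dicts to the website; comments without a marker simply yield an empty list, so ordinary discussion on the bug is unaffected.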
In short: No, this wouldn't work.
We have enough issues with Bugzilla already (maintaining it is a nightmare). The code was written in Bash + TCL + Superglue, or something. No one wants to touch that.
Another option is a more user-centred approach, or perhaps a combination of the contributing-user review system you suggest and a user-centred system.
What I mean by a user-centred approach is one that is very easy for any one to use. In script manager a "view script" and "report problem" button next to each script. The "report problem" leads to an explanatory message and options of the kind of issue. This doesn't produce bug reports which need to be reviewed by anyone, just statistics against the script (including specific version). GHNS could then place warnings against scripts with significant numbers of warnings. This could work together with the VCS you suggest. Perhaps I am in a fantasy land re. the technical difficulties, but in theory at least scripts with many warnings could be flagged up to reviewers to pay particular attention to. At that point you might ask whether "bug reports" would be useful but I specifically don't want anything that is going to create any extra work for the reviewer. If a user wants to do more than warn other users about a script then they can head to the amarok script repo themselves.
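The "report problem" button boils down to counting reports per script version and flagging a version once the count crosses some threshold. A minimal Python sketch follows; the class name `ReportStats` and the default threshold of 10 are assumptions made for illustration, not part of the proposal.

```python
from collections import Counter


class ReportStats:
    """Counts user problem reports per (script, version, kind)."""

    def __init__(self, warn_threshold=10):
        # Assumed cut-off; a real system would have to tune this.
        self.warn_threshold = warn_threshold
        self.counts = Counter()

    def report(self, script, version, kind):
        """Record one click on the 'report problem' button."""
        self.counts[(script, version, kind)] += 1

    def total(self, script, version):
        """All reports for one script version, regardless of kind."""
        return sum(n for (s, v, _), n in self.counts.items()
                   if s == script and v == version)

    def needs_warning(self, script, version):
        """True once enough users have reported this exact version."""
        return self.total(script, version) >= self.warn_threshold
```

Nothing here creates work for a reviewer: the statistics only drive the warning shown in the script manager, and a reviewer can look at the flagged versions if and when there is time.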
A user centred system seems to make the most sense. Definitely something that could be explored a lot further.
One solution is to use AppArmor in order to prevent applications from tampering with paths other than those they need.
See here:
http://amarok.kde.org/blog/archives/1153-The-Malware-Problem-and-a-solution.html#c7355
I still don't understand why not to use a sandbox or AppArmor.
For example: the only disk access an Amarok plugin should be able to do is to save some configuration and some data. It shouldn't be allowed to read/write anything on the disk apart from its own data, not even other plugins' data. In the case of a screensaver it doesn't even need to have access to disk at all. It shouldn't be able to access the internet either. It should however be able to read the libraries that are necessary for it to work (just read permission), but nothing else than that. I can see no problem doing it this way, except for the implementation of the sandbox in Amarok/gnome-screensaver itself, which shouldn't be impossible.

One question about the VCS method: where would this repository be hosted? What if the Amarok project decides to go private, or if a meteorite falls on the servers? Would scripts not be installable anymore until Amarok is recompiled on every client (computer) and updated to a new server, which might not even be totally secure? I think malicious code will always get to the computer one way or another, but handling it the best way (sandboxing it might be a good idea) is the best option.
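The rule "a plugin may only touch its own data directory" can be expressed as a path check. The Python sketch below shows only the policy decision; the function name `is_allowed` is made up here, and a check like this inside the host application does not by itself confine an in-process script. Real enforcement would need an OS-level mechanism such as the ones named in this thread.

```python
from pathlib import Path


def is_allowed(requested, allowed_roots):
    """Return True if `requested` lies inside one of `allowed_roots`.

    Both sides are resolved first, so '..' segments and symlinks
    cannot be used to step outside an allowed directory.
    """
    target = Path(requested).resolve()
    for root in allowed_roots:
        try:
            # relative_to compares whole path components, so
            # '/x/lyrics-evil' is not treated as inside '/x/lyrics'.
            target.relative_to(Path(root).resolve())
            return True
        except ValueError:
            continue
    return False
```

A screensaver would get an empty `allowed_roots` list (no disk access at all), while an Amarok script would get just its own configuration and data directory.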
This article is not just about Amarok.
The solution is meant for all of KDE's download components (Plasma applets, tools, whatnot). There are numerous KDE apps that use GHNS for downloading add-ons from kde-apps.org.
I wasn't only talking about Amarok. Anything downloaded from GHNS should only be able to access predetermined paths.
Things downloaded from GHNS could also be sandboxed. Sorry, but I still can't see the problem with this solution.
"In September 2007, Novell laid off the AppArmor team."
Begging the question, is it maintained at all? http://news.cnet.com/8301-13580_3-9796140-39.html
This is more a reference to the original gnome-look malware, and I'm no developer, however would it be feasible for developers to ONLY provide source code directly readable on the download page, while a client-side plug-in reads the code and creates the appropriate binary and files on-the-fly and then installs them?
That plug-in would then be the only attack vector if compromised, though it could eventually be managed and maintained by distro folks (which could ensure security and compatibility). Again, that's an end-user's point of view, feel free to point out its flaws!
Do what Free Download manager does.
Make a rating system for these things. Users can then look at what other users say. And do a review system. Make it mandatory for an Amarok script developer to create an account. This way everything can be swept clean by everybody. And what about a nice distinction between Open Source and closed software? Because with Open Source it's easy to check whether it's malware; with closed software it isn't.
Sandboxing DOES help -- a lot!
Look at the Java sandbox model and apply it to all execution, even scripts. When you first install a screensaver (or anything, of any kind, say even a Firefox extension), the package is signed and states what capabilities it needs. So when you install the screensaver, the system asks you: Hey, is it okay for this screen saver to: Open raw network connections? Delete files with root privileges? The end user should be smart enough to know that a screensaver should not require these privileges. Just say no. The sandbox will not permit operations that the user has not approved. Dangerous operations must be stated in the package manifest and approved by the user at install time.

Even better, any given framework, say the screensaver framework (or Firefox, or Amarok, etc), can state what the expected privileges should be. For example, the screensaver framework can caution the user that most screensavers should not need to delete files or open network connections. A screen saver that does those things as part of its core function should be able to state why it needs this privilege.

Thus sandboxing of almost everything, even bare metal binary executables, would prevent a lot of unexpected behavior (such as DDOS or Spam sending) for the average joe fourpack user.
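The manifest idea has two parts: at install time, compare what a package requests against what its framework normally expects, and at run time, refuse anything the user did not approve. Here is a small Python sketch of both; the capability names, the `EXPECTED` table and the `Sandbox` class are all invented for illustration and do not correspond to the Java sandbox API or any existing KDE component.

```python
# Hypothetical table: what each kind of add-on is normally expected to need.
EXPECTED = {
    "screensaver": {"draw", "read-own-config"},
    "amarok-script": {"read-own-config", "write-own-data", "network"},
}


def unexpected_capabilities(kind, requested):
    """Capabilities the manifest asks for beyond the framework's norm.

    These are the ones worth a prominent warning at install time.
    """
    return sorted(set(requested) - EXPECTED.get(kind, set()))


class Sandbox:
    """Permits only operations the user approved at install time."""

    def __init__(self, approved):
        self.approved = frozenset(approved)

    def check(self, capability):
        if capability not in self.approved:
            raise PermissionError(f"capability not approved: {capability}")
        return True
```

For a screensaver whose manifest requests `draw`, `network` and `delete-files`, `unexpected_capabilities` returns the last two, which are exactly the ones the installer should ask about. If the user declines them, the sandbox is built with `draw` only and any later `check("network")` fails.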
Yes, that is what should be done.
Yet packages downloaded from trusted repositories could do without a sandbox. This method would be really useful when dealing with scripts, or small binaries that come in via GHNS. Each GHNS client should define what paths and what parts of the system they are allowed to touch. If an app requires special privileges, it should be put on review until someone checks whether it's secure; if client policies are well done, there would be almost no need to put scripts on review. To let the user choose is a bit dangerous, since they'll mostly just say "Yeah, you can break my system, but give me my screen saver!" If the user is allowed to choose, it should be made REALLY clear that it might be dangerous, no matter how silly that seems.
Even if we used such a sandbox system (which I'm still very doubtful about), our proposal (putting it all in a VCS) would still make sense.
There are numerous advantages to the VCS approach that go beyond the Malware issue, as I had explained in the article, and in comments here.
It's easy for anyone with shell access to the svn server to go back and change someone else's past commit, inserting their malware at that point without anything pointing to them as the perpetrator. Git's chained checksums prevent this sort of attack (modulo checksum collision issues).
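Why chained checksums make silent history edits detectable can be shown with a toy hash chain. This Python sketch is a simplification, not Git's actual object format: each "commit" id here is just a SHA-256 over the parent id, the author and the content, and the function names are made up for the example.

```python
import hashlib


def commit_id(parent_id, author, content):
    """Hash of one commit, covering its parent's id (toy format)."""
    h = hashlib.sha256()
    for part in (parent_id, author, content):
        data = part.encode("utf-8")
        # Length prefix keeps the fields unambiguous when concatenated.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def build_chain(commits):
    """Return the list of ids for a sequence of (author, content) pairs."""
    ids = []
    parent = ""
    for author, content in commits:
        parent = commit_id(parent, author, content)
        ids.append(parent)
    return ids
```

Changing the content of an old commit changes its id, and because every later id includes its parent's id, all later ids change too. Anyone holding a copy of the old head id notices immediately. The author field is just a string that goes into the hash, so nothing in this chain proves who really wrote a commit.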
To be fair though, there is nothing in git that prevents someone from spoofing a commit as being from another committer. So if you can sneak your obfuscated malware in past the people who read the commits, you don't have to worry about being banned when it's later noticed; your fall guy takes the blame.
Mark,
My friends and I over at sandboxing.org would love to hear any specific criticisms or general requests that you have of today's sandboxing systems. Please speak up about what you've tried and where you've been left unsatisfied. Regards, Michael
I don't believe that it is possible to trace the source of malware every time. I would like to see that in action, because such tracing could be very helpful in fighting not only malware but also spyware, viruses and other harmful applications.
How many scripts do you need for a particular Amarok version? Could a few dozen well-designed functions be enough to give almost all the functionality your listeners want? If so, you could still add a new script occasionally, when a new requirement comes up, or an omission gets discovered. These scripts would be in a VCS, but a small and very controlled one.
With only a few dozen scripts active, qualified volunteers or staff could look through the code, run tests, and approve valid scripts as developers contribute them. Known developers could keep working on improvements, and publish beta versions with enhancements, which could be fully accepted later. Then hardly anybody would need to run any other scripts. So malware practitioners would first have to find an unmet need in Amarok, use it to convince you that a new script is important, develop the script, then get it past your developers. They have easier pickings elsewhere.

