ContentslistsavailableatScienceDirect
Computer Networks
journalhomepage:www.elsevier.com/locate/comnet
Mistrustful P2P: Deterministic privacy-preserving P2P file sharing model to hide user content interests in untrusted peer-to-peer networks
Pedro Moreira da Silva
∗, Jaime Dias, Manuel Ricardo
INESC TEC, Faculdade de Engenharia, Universidade do Porto Rua Dr. Roberto Frias, 378, Porto 4200-465, Portugal
a rt i c l e i n f o
Article history:
Received 3 October 2016 Revised 10 March 2017 Accepted 5 April 2017 Available online 6 April 2017 Keywords:
Peer-to-peer Deterministic privacy Content interests Plausible deniability Legal framework
a b s t r a c t
P2Pnetworksendowedindividualswiththemeanstoeasilyandefficientlydistributedigitalmediaover theInternet,butuserlegalliabilityissuesmayberaisedastheyalsofacilitatetheunauthorizeddistribu- tionandreproductionofcopyrightedmaterial.TraditionalP2Pfilesharingsystemsfocusonperformance andscalability,disregardinganyprivacyorlegalissuesthatmayarisefromtheiruse.Lackingalterna- tives,andunawareofthe privacyissuesthatarisefromrelaying trafficofinsecureapplications,users haveadoptedanonymitysystemsforP2Pfilesharing.
Thiswork aimsathidingusercontentinterestsfrommaliciouspeersthroughplausibledeniability.
TheMistrustfulP2PmodelisbuiltontheconceptofmistrustingalltheentitiesparticipatingintheP2P network,henceitsname. Itprovidesadeterministicand configurableprivacyprotectionthat relieson covercontentdownloadstohideusercontentinterests,hasnotrustrequirements,andintroducesseveral mechanismstopreventuserlegalliabilityandreducenetworkoverheadwhileenablingtimelycontent downloads.
WeextendpreviousworkontheMistrustfulP2Pmodelbydiscussingitslegalandethicalframework, assessingitsfeasibilityformoreusecases,providingasecurityanalysis,comparingitagainstatraditional P2Pfilesharingmodel,andfurtherdefiningandimprovingitsmainmechanisms.
© 2017ElsevierB.V.Allrightsreserved.
1. Introduction
In the last two decades, the advances in computer technol- ogyhavedrasticallychangedthemedialandscape.Thewidespread availability of digital media and broadband Internet connections enabled consumersto become also producers anddistributors of creative work, something that inthe past was mostlylimited to professional parties. Peer-to-Peer (P2P) networks endowed indi- viduals with the means to easily and efficiently distribute digi- tal media over the Internet, and are extensively used for large- scale file sharing due to their decentralized andscalable nature.
Nevertheless, asmoreinformation flows throughthese networks, usersarebecomingincreasinglyconcernedabouttheirprivacy.The P2P architecture enables solutions that enhance privacy, such as Freenet[1],Nymble[2],andTor[3],butuserlegalliabilityissues may be raised asit also facilitates unauthorized distribution and reproductionofcopyrightedmaterial[4].
∗ Corresponding author.
E-mail addresses: [email protected] (P. Silva), [email protected] (J. Dias), [email protected] (M. Ricardo).
Thereasonsbehindseekingprivacymaybevarious,suchasto avoiduserprofiling,trackinganddatamining, toprivately down- load legal contents that may be embarrassing or objectionable froma political, religiousorsocial point-of-view,orto download unethical, illegal orincriminating contents withoutbeing subject toanyliability. Thecurrentstate ofcomputer technologyenables organizationsandindividualstocreatedatabasesofpersonalinfor- mation that were previously impossibleto set up,and swapthis information,sellitoruseitinanyotherwayasacommodity[5]: organizationsandindividualsareabletocollectinformationabout users, combine facts from separate sources, and merge them to createsuch databasesofpersonalinformation.Usersmayfeelex- posedorembarrassedifitbecomespublicthattheyhadaccessto contentssuch asto helpdealingwithalcoholanddrug abuse,to help dealing withanger management, orconsidered heretical or profane.Evenaccessingillegalorincriminatingcontentsmayhave anethicalmotivationbecauselawsare,ideally,drawnfromethics, butthat is not always the case:e.g., autocratic regimes consider illegaltoshareorevenaccesscontentsthattheyconsiderinappro- priateorpotentiallyharmful.
Copyright is a legal device that protectsthe expression of an ideabyconferringthecreator,foraperiodoftime,exclusiverights http://dx.doi.org/10.1016/j.comnet.2017.04.005
1389-1286/© 2017 Elsevier B.V. All rights reserved.
to publish,sell and control the reproduction of his work, whom may grant or sell these rights to others. Copyright infringement andpiracyarethenviolationsofoneoftheseexclusiverights[6]. However,theserightsaregranted nationwide,notworldwide:the BerneConvention[7]isnotratifiedby allcountriesandonlysets the minimum standards for copyright. Therefore, given that P2P usersare spread all over the world anda single content sharing maycross manyjurisdictions, it is hard to determine the appli- cablelegal frameworkandtheextentofuserliability. Evenwhen thelegalframeworkcanbeclearlydefined,itmaystillbedifficult todeterminetheextentofuserliabilitysincethecopyrightrights mayconflictwithotherinterestsandrights,beingsubjecttopro- portionality assessment to balance all the rights andinterests at stake[8].
Traditional P2P file sharing systems focus on performance andscalability, disregarding anyprivacy or legal issues that may arise from their use. These systems take advantage of the large number of interconnected peers1, and their idle resources, to moreefficientlydistributecontents at thecost ofrequiringpeers to publicly advertise what they download, making it trivial to identifyusercontentinterests. Thisproblemisfurtheraggravated bythefactthatpeersforminterest-basedcommunities,andevery single connection presents an opportunity for a malicious peer to passively obtain additional information that may enable the identification of user content interests, with high certainty, by monitoringjustasmallfractionofthenetwork[9].
Several privacy-preserving P2P systems have been proposed, but, given that there is a trade-off between privacy and perfor- mance, they consider differentattack models and employ differ- enttechniquesforprivacypreservation. Themajorityofthe solu- tionsemployeithertechniquestoprovideanonymityortechniques toprovideplausibledeniability. Techniquesto provideanonymity, such as onion routing [10] and information slicing [11], are usu- allystronger but havelower performance. Techniques to provide plausibledeniability,suchasrequestrelaying– peersrelayrequests tocreateuncertainty aboutcommunicating endpoints– and con- tentinterestdisguise– peersdownloadadditionalcontentstohide their real interests –, are usually weaker buthave better perfor- mance.Despitetheirdifferences,allP2Pfilesharingsystemsshare one common issue: they require peers to advertise, either fully orpartially,what they download.Lackingalternatives,users have adopted anonymity systems for P2P file sharing [12], misunder- standing the privacy guarantees provided by such systems [13], in particular when relaying traffic of insecure applications [14]. Anonymity systems provide a channel to anonymously transmit messages,buttheuser’sidentity maybedisclosedby thecontent ofthat messages,whicharethesoleresponsibilityoftheapplica- tion.
The Mistrustful P2P model [15] provides plausible deniability through content interest disguise, as in other privacy-preserving solutions,butithasnotrustrequirements,preventsuserlegalli- abilityincaseoflegitimate usage,andensuresdeterministic pro- tectionof usercontent interests against attacksofa size up toa configuredlevel.Itisbuilton thebasis ofmistrustingall theen- titiesparticipatingintheP2Pnetwork,henceitsname,andthere- foreusersarenotrequiredtoestablishtrustlinksinordertopar- ticipateinthecontentsharing.TheMistrustfulP2Pmodelenables eachusertosettherequiredtrade-off betweenprivacyandperfor- mancebyconfiguring,percontent,thesizeofthelargestgroupof colludingpeerstobeprotectedagainst,andtheminimumnetwork overheadofcontentinterest disguise(minimumnetworkdisguise overhead).
1We use the term peer to refer to the network node, and the term user to refer to the person.
Inthispaperweextendtheworkpresentedin[15]asfollows:
we discussthelegal andethical frameworkto pinpointthe main issuesandchallenges,andtoprovidesomeinsightregardinguser liability;furtherassessthefeasibilityoftheMistrustfulP2Pmodel byconsideringasetof84usecases;provideasecurityanalysisfor commonattacks;compareourmodelagainstatraditionalP2P file sharing modelto betterestimate theimpact ofprivacy preserva- tion;furtherdefineittogetclosertoenablerealimplementations, whileimproving itsmain mechanismsto betteradapt tothe P2P filesharingdynamicsandtotheuserprivacyrequirements.
The remainder of this paper is structured as follows.
Section 2 depicts the main legal and ethical challenges at stake along with user liability considerations. Section 3 briefly de- scribes traditional P2P file sharing systems. Section 4 character- izes theproblemwe aim tosolve. The relatedwork ispresented inSection5.Section6describesthelatestversionoftheMistrust- fulP2Pmodel.Sections7and8provide,respectively,thesecurity analysisandtheevaluationofourmodel.The resultsanddiscus- sionarepresentedinSection9.Section10drawsthemainconclu- sions.
2. Legalandethicalframework
Computer ethicsis a field ofstudy that addresses the ethical challengesofcomputertechnologybutseverallinesofthoughtex- ist. Therefore, in this section our aim is to describe the ethical challengesthatweidentifiedbeingrelevanttoourwork.Werefer thereaderto[16]and[17]fora wideand thoroughexplanation of thesechallenges. Forthe legalandprivacydimensionsofP2P file sharing,basedontheintellectualpropertylaw,inparticularonthe copyrightlaw,weattempttohighlightthekeylegalaspectstotake intoconsiderationonWesterncountriesregardingdirectandindi- rectuserliability.
Lawsareformallyadoptedrulesthatmandateorprohibitacer- tainbehavior, createdby themembersofasocietytobalancethe individualrightstoself-determinationagainsttheneedsoftheso- cietyasawhole,and,ideally,aredrawnfromethics,whichdefine whatisconsideredrightorwrong,i.e., thesocially acceptablebe- haviors.Thekeydifferencebetweenlawsandethicsisthatthefor- mercarry theauthorityofagoverning body,usually anation[5]. Inturn,ethicsarebasedonculturalmores,beliefs,valuesandprin- ciples,whichreflecttheuniqueexistentialexperiencesthatweac- cumulateasindividualsaswell associeties and,supportedonin- stitutions, provide long-term stablerules that are madeobvious.
Thus,theserules canbeseenasrefractionsofthecommonworld awareness that give rise to different experiences and interpreta- tions:multiculturalethics[18].
The cultural differences, despite some ethical standardsbeing universal(e.g.,murder andtheft),makeitdifficulttodefine what isethicalornot.Studieshaveshownthat theperspectiveoneth- ical practices ofindividuals regarding the use of computer tech- nology differs withtheirnationality. Asian traditions ofcollective ownership conflict with Western protection of intellectual prop- erty, and many of the ways the former use software is consid- eredsoftwarepiracybythelatter[5].Thesedifferingperspectives are alsoevident on the control overthe Internet content andon the surveillance made by governments: they radically differ be- tweencountries[19].Accordingtoareport[20]fromtheReporters Without Borders, the Great Firewall of China is getting “taller”, theUnitedKingdom isthe“world championofsurveillance”,and
“NSA2symbolizesintelligenceservices’abuses”;ontheotherhand, Finland,Netherlands,andNorwayareatthetopoftheWorldPress FreedomIndex2016[21].
2United States National Security Agency.
The utmost objective of intellectual property law is to pro- mote progressbyencouragingandstimulatinghumanintellectual creativity andbroaddissemination ofits result [22]. Creatorsare granted exclusive rights asan incentiveto continuetheir works, and,atthesametime,thesocietycanbenefitfromtheirbroaddis- semination.However,giventhattheenforcementofsuchexclusive rightsontheprivatesphereofusersmayclashwiththeirprivacy rights,theprivatecopyinglaw,whenpresent,introducesanexcep- tion to these exclusive rights undersome conditions.For private useandforendsthat areneither directlynorindirectlycommer- cial,anaturalperson3isallowedtocopycopyrightedworksonthe condition that the rightholders receive faircompensation. There- fore, anymedia that may be used for the reproduction of copy- righted works by consumers are designated for payment of the private copying levy, which willcompensate rightholdersfor any harm that maybecaused [23,24].The privatecopying law,when present,differsconsiderablybetweencountries[25],reflectingthe lackofconsensus andthecomplexityofthissubject. Moreover,it isnotclearthatprivatecopyingalways causesharm– e.g.,itmay increase thegroupoffansandenablesuserstotrybeforebuying –,andnotallmedia areusedforthereproductionofcopyrighted works.Wereferthereaderto[22,24,26]forfurtherdiscussionon thesubject.
P2Pnetworksconnectusersallovertheworldwithoutconsid- eringeitherinternationalbordersorculture,therebyasinglecon- tentsharingmaycrossmanyjurisdictions.Thesenetworksareusu- allyconnotedwithcopyrightinfringement becauseitisestimated that,overthecourseofamonth,96.3%ofusersofBitTorrentpor- talshavedownloadedatleastoneinfringingcontent[27].Itisnot feasibleto enforcetheexclusiverightsprovidedby copyrightlaw, ifpresent,inevery singlejurisdictionwhileensuringthattheydo notconflictwiththerightsandinterestsofusers.Asso,inan at- tempttomitigatetheimpactofcopyrightinfringement,righthold- ersarenowtryingtoblockwebsitessuchasThePirateBayinstead oftryingtobringenduserstocourt[28]because,ingeneral, itis moreefficient andlessonerous toprove that theownersofsuch websites profit from the copyright infringement. Therefore, such websitesaresubjecttoindirectliabilityastheyinduce,contribute to or fail to prevent direct copyright infringement. Nevertheless, this does not mean that sharing copyrightinfringing contents is legal.Onthecontrary,usersmaybesubjecttocivilandpotentially criminalliability [24].Theextent ofsuch liabilitydependsonthe applicable legalframeworkandontheintent ofsuchacts:a user mayalsobesubjecttoindirectliability.
The multitude and complexity of copyright laws across the globemakeitimpossibletodefineclearboundariesregardinguser liability.Still,itisourbeliefthattheusersofP2Psystemswillnot beheldlegallyliablefor,unknowinglyandunwillingly,download- inganillegalcontent(directliability)orcontributetoitsdownload (indirectliability) ifthemainmotivationtousetheP2P systemis tosharelegitcontents,anditcannotbe proventhattheuserhad access, either inpartor fully,to the contentdata. Theuser does notbenefitdirectlyfromthedownloadofanillegalcontentthathe hasnoaccessto,andisalsounabletodeterminethat thecontent isunexpectedly illegalbeforehaving accesstoitsdata.Therefore, it is plausible that heeither hasbeen mislead into downloading it or hadno intentionin assisting a misbehaving peer infringing copyrightlaw.
Insum, the basicethicalimperativesare thata personshould not, knowingly and willingly, cooperate in or contribute to the wrongdoing of another, and that the human intellectual creativ- ity needs to be encouraged and stimulated in order to promote
3In jurisprudence, a natural person is a human being, as opposed to a legally generated juridical person.
SEEDER LEECHER
Owns
(all chunks) (downloaded chunks)
Shares
(all chunks) (downloaded chunks)
Downloads (does not download)
(missing chunks) Owned data chunk Missing data chunk
Fig. 1. Peer roles and sharing behavior on traditional P2P systems. A seeder has all data chunks and just shares them; a leecher is a peer still downloading missing data chunks and sharing the data chunks it has already downloaded.
progress.The extent of theseincentives depends on the cultural background as the well-being of the society may be incentive enough (collective ownership) or further incentives may be re- quired(intellectual property rights).P2P networksintroduce new challengestothe privatecopying levysystem, andthe legislation onthissubjectisexpectedtochangeinthenearfuturetoaddress them.Users ofP2Psystemsshouldnotbesubjecttoanyciviland criminal liability asa resultof legitimate usage ofthe system if themainmotivation touseit istosharelegit contents, andthey havenoaccesstothecontentdata.Theintentisimportanttode- terminetheextentofuserliability,especiallyifhisactionsdirectly orindirectlycausedprovableharm.
3. TraditionalP2Pfilesharingsystems
TraditionalP2P filesharingsystems focusonperformance and scalability,disregardinganyprivacyandlegalissuesthatmayarise from their use. In this section we provide an overview of these systems,andpresenttheir mainprivacy andlegalissues.We use BitTorrentas an example given that it is the mostprominent of suchsystems.Westartby providingabriefdefinitionofcommon BitTorrent terminology – hash, content, chunk, torrent file, peer, seeder,leecher, swarm, andtracker –, describe its content sharing process,andendpresentingthemainprivacyandlegalissues.
Ahashisanalphanumericstringusedtoidentifyandtoverify the integrity of the data being transferred, usually an SHA-1 di- gest (hexadecimalstring). A contentis composed ofone or more files, identified by the hashof its data,and partitionedinto sev- eralpieces.Achunk isone ofthosedatapieces, andatorrentfile providesthecontent’smetadata.Apeerisanodesharingchunks, which, fora given content, can be either a seeder – a peerthat has all data chunks and is justsharing them – or a leecher – a peerstill downloadingthecontent andsharingthe chunksit has alreadydownloaded. Fig.1 depictsthepeerroles andtheir shar- ingbehavior.Aswarmisthesetofpeerssharingthesamecontent (peers forminterest-basedgroups tosharecontents), andisusu- allyidentifiedthroughthehashofthecontent.Atrackerisacen- tralnode that provides listsofpeers inswarmsby keepingtrack ofwhichpeersare sharingwhich contentsandtheir role (seeder orleecher)oneachcontent.
Traditional P2P systems take advantage of the large number of interconnected peers, and their idle resources, to more effi- ciently distributecontents; their main performance metric is the average download time. E.g., BitTorrentprotocol employs several mechanisms for chunk and peer selection, such as rarest first
A A
Tracker 1. Register
S L A
mrawS eht nioJ
reeP tceleS .3
S
L L
L L
L
4. Synchronize Owned Chunks
5. Transfer Chunk SWARMX
S Seeder L Leecher Swarm Tracker
A Unregistered Peer A A Leecher A A Seeder A
Fig. 2. Overview of the content download process on traditional P2P systems. First, peer A registers at the tracker to join the swarm of content X (swarm X). After registering, the peer starts as a leecher (still downloading) and requests from the tracker a list of peers (a subset) sharing that content. Until download completion, peer A selects peers from that list, synchronizes the chunks it owns with theirs, and transfers those available, if any, that it misses; the list of peers may be updated during content download. Upon download completion, peer A notifies the tracker to be known as a seeder.
mechanism – locally rarest chunks are downloaded first –, and optimisticunchoking– periodicallyselectarandompeer– aiming atconstantlyimprovingthedownloadbitrate[29].Thedistinction betweenseedersandleechersis used toestimate relative down- load times (through their ratio), and also to quickly assess the contentavailability4,whichisassuredifatleastasingleseederis present.Thestepsrequiredtodownloadagivencontent,depicted in Fig. 2, can be summarized as follows: (1) a peer willing to download a given content registers at the tracker and joins the swarm;(2)itrequestsalistofpeers(asubset)intheswarmfrom the tracker; (3) it selects peers from that list to obtain missing chunks inorderto complete itsdownload; (4)peers synchronize thechunkstheyownandmisstodetermineifanymissingchunks canbetransferred;(5)ifselectedpeersownmissingchunks,those chunks get transferred;(6) upon downloadcompletion, the peer notifiesthetrackertobeknownasaseederforthatcontent.
ThemainprivacyissuesoftraditionalP2Psystemsarethatthey publiclydisclose user content interests, andprovide a proof that theuserisabletoaccessacontentinpartorentirely;theformer disclosesan intention whilethe latterprovides a proof of its re- alization.Apeeronlydownloadsthecontents thatauserisinter- estedin,therefore,byregisteringatatrackerandjoiningswarms, theusercontentinterests arebeingpubliclydisclosed.Peers pro- vide a proof that the useris able to accessa content inpart or entirelybyadvertisingwhichchunkstheyown,andentirelybyno- tifyingthetrackerupondownloadcompletion.
ThemainlegalissueoftraditionalP2Psystemsisthat,unknow- inglyandunwillingly,a usermaybe heldliable forcopyrightin- fringement dueto illegal content download.If a content is pub- lishedwithamisleadingdescriptionoriftheresourceusedtoob- tainthehashofthecontentisinsidious,theuserwillonlybecome awareofthatfactafteraccessingthecontentdatainpartorfully.
Nevertheless,forcontentsthatcanbeaccessedinpartbeforefully downloadingthem,which isusually the caseofmultimedia con- tents,theusermaybe infringingcopyrightlawafterdownloading asingleora fewchunks.The copyrightinfringementcanbe triv-
4A content is available if it can be fully downloaded.
ially provedbecause peersadvertisewhichchunks they ownand notifythetrackerupondownloadcompletion.
4. Problemdefinition
Inthe context ofthiswork, wedefine privacy preservationas theconcealmentofusercontentinterests.Weaimatdevelopinga privacy-preservingP2Pfilesharingmodelthat:
Hidesusercontentinterests. Theusermustbeabletohidehiscon- tentinterestsfromanyparticipantinthesystem:trackers,regular peers,orgroupsofcolludingpeers.Protectionagainstexternalen- titiesmonitoringalltheuser’straffic,suchasISPsorgovernments, isoutofthescopeofthiswork.
Has no trustrequirements. The user must be ableto download a content without having to trust anyone. Therefore, the privacy- preservingP2Psystemmustenablecontentsharinginlargegroups ofuntrustedpeers(untrustedP2Pnetworks).
Prevents user liability. The usershallnot be subjectto anyliabil- ity as a result of legitimate usage of the privacy-preserving P2P system. The user shall be protectedagainst actions ofmisbehav- ingnodes,orfromthedownloadofcontentspublishedwithmis- leadingdescription (misleading contents),which maydrive users tounknowinglyandunwillinglydownloadunethicalorillegalcon- tents.
Enablestimelydownloads. Theusershouldbe abletodownloada contentinduetime.Theaveragedownloadbitratemustbewithin the same order of magnitudeof traditional P2P systems that do notpreservetheprivacyofusers.
5. Relatedwork
Severalprivacy-enhancingP2P systems havebeenproposed in the literature for a wide range of applications, providing differ- entdegreesofprivacytousersandemployingvarioustechniques, the majority of which provides privacy preservation through anonymity or through plausible deniability. Freenet [1] and Tor[3]are probablythemostprominentanonymity solutionsfor, respectively, anonymous content distribution networks and low- latency anonymity.Despitetheir merits, anonymity systems tend tointroducemoreoverhead,andtheirusersmaybeheldindirectly liable.Forinstance,Tordefaultstoapathlengthoffour(300%net- work overhead), which lowers the throughput and increases the average latency [30], anda Tor relay node (core router) mayre- laytraffic ofmisbehaving peers,whichmakesthelast oneonthe path(exitnode)toappeartobetheoriginatorofsuchtraffic.Sys- temssuchasOneSwarm[31],whichrelyontrustlinkstoprovide privacy,arenotconsideredgiventhattheyarenotsuitableforun- trustedP2Pnetworks.Herein,wedepicttheprivacy-enhancingP2P systems providing plausible deniability and designed specifically forP2Pfilesharing.
The description provided for each privacy-enhancing P2P sys- tem focus on the trust requirements, on the protection against bothpassiveandactiveattacksaimingatidentifyingusercontent interests,andonthepotentiallegalliabilityofusers.
BitBlender[30]providesplausibledeniabilitybyintroducingre- laypeersthatsimplyproxyrequestsonbehalfofotherpeers.Peers willing toact asrelay peers canregister ata centralnode called blender,and,oncerequested,willjoinaP2Pswarminaprobabilis- tic wayso that they cannot be distinguished fromregular peers.
The joining probability of relay peers is defined by the blender, when asking registered peers to join a P2P swarm, so that the
Table 1
Comparison of privacy-preserving P2P file sharing systems regarding trust requirements, protection against passive and active attacks aiming at identifying user content interests, and potential user legal liability.
Protects Against Attacks
Work Has No Trust Requirements Passive Active Prevents Legal Liability
BitBlender ✗ ✗ ✗
SwarmScreen ✗ ✗
The BitTorrent Anonymity Marketplace ✗
Petrocco et al. ✗ ✗
set of relay peers remains unknown while having the cardinal- ity requested by the tracker. It enables the identification of reg- ularandrelaypeersthroughdownloadprogresstracking,andpro- vides a trivial proof of content download because peers adver- tisewhat they download.Thereby,BitBlenderprovides protection against passive attacks, but it is vulnerable to active attacks us- ingdownloadprogresstracking.Itrequiresuserstotrustboththe tracker and the blender. Its users may be subject to legal liabil- ityfordownloadingmisleadingcontents,andforrelayingtrafficof misbehavingpeers.
SwarmScreen [9] provides plausible deniability by obscuring user content interests through content interest disguise.The de- vised scheme, whichconsists in “addinga small percentage(be- tween 25% and 50%) of additional random connections that are statisticallyindistinguishablefromnaturalones”,thwartsguilt-by- associationattacks,thatis,attacksinwhichtheusercontentinter- ests can be inferred with highcertainty just by classifyingpeers based on the behavior of the communities they participate in.
SwarmScreenpresentsnotrustrequirements,butitsattackmodel onlyconsiderspassiveattacks.Itisvulnerabletoactiveattacksbe- cause contents can be distinguished through download progress tracking. Peers communicate through direct links, butusers may besubjecttolegalliabilityduetomisleadingcontentdownload.
The BitTorrent Anonymity Marketplace [32] follows Swarm- Screen’s approach to provide plausible deniability. It does not present any trust requirements, and peers also communicate through direct links. However, in order to protect against both passive and active attacks, all contents are fully downloaded to makethemindistinguishable,giventhatanattackerisabletofully trackdownloadprogress:peersadvertisewhatthey ownormiss.
Theauthorsdefinek-anonymityastheprivacyprotectionlevelob- tainedfromfullydownloadingkcontents.Thus,sinceitintroduces highnetworkoverhead,it eitherpreventsdownloads fromtimely completingorconstrainstheachievablelevelofprivacyprotection.
Usersmaybesubjecttolegalliabilityduetothedownloadofmis- leadingcontents.
Petrocco et al. [33], following SwarmScreen’s approach, pro- posed a system that aims at hiding user content interests with- out compromising timely downloadcompletion. Their systemre- lies on private swarms, request relaying, caching, and partial ad- vertisement ofdownloaded chunks.As statedby theauthors,pri- vate swarms are required to ensure a good level of privacy, i.e., toavoidtheidentificationofusercontentintereststhroughdown- loadprogresstracking.Yet,toobtainthecredentialsneededtojoin a privateswarm, peersmusttrustone ormoreparticipants.Also, asonlyafractionofthechunks areadvertised,itisnotclearhow acontentsharingisbootstrappedwithfewseedersorhowrequest relayshould operateduringperiodsofcontentunavailability.This systemprovides protectionagainstpassiveandactive attacks,but itsusersmaybeheldlegallyliableduetorelaytraffic.
Table 1 summarizesand compares the privacy-preserving P2P filesharingsystemsdescribedabovetakingintoconsiderationthe trust requirements,theprivacyprotection againstpassiveandac- tive attacks, and the potential legal liability of users. BitBlender and SwarmScreenare unable to hide user contentinterests from
active attacks using download progress tracking. The BitTorrent Anonymity Marketplace andPetrocco et al.’s systemsare able to thwartbothpassiveandactiveattacks.BitTorrentAnonymityMar- ketplacerequirespeers tofullydownloadall contentsinorder to make them indistinguishable, and therefore introduces a consid- erablenetworkoverheadthatwilleitherprevent downloadsfrom timelycompletingorconstraintheachievablelevelofprivacypro- tection.Petroccoetal.’ssystemrequirespeerstotrustoneormore participantsinordertoreducenetworkoverheadby usingpartial advertisementofdownloadedchunks.
All solutions require peers to advertise, either fully or par- tially,what they ownormiss. Anattackermayexploit download progresstrackingtoobtaina proofthat theuserisabletoaccess a content either entirelyor in part, andto narrow or even void plausibledeniability. Forunethical orillegal contents that canbe accessedin partbeforefullydownloading them,which isusually thecase ofmultimedia contents, an attackermayobtain a proof ofsuch access.As so,theusermaybe held legallyliablefor,un- knowinglyandunwillingly,downloadingasingleorafewchunks, beitduetotraffic relayingorduetomisleadingcontentdescrip- tion. Thereby, a legitimate usage of these systems may hold the userliableforcopyrightinfringement.
6. MistrustfulP2Pmodel
In this section we describe the latest version of our model, whichisbuiltupontheassumptiondescribedinSection 2:legiti- mateusersshouldnotbesubjecttoanylegalliabilityifthesystem ismainlyusedtosharelegitcontentsandthereisnoaccesstothe contentdata(noproofofaccess).Wereferthereaderto[15]fora priorversion.
This section is structured as follows. Section 6.1 provides an overviewofourmodelanditsmainbuildingblocks– contentin- terestdisguiseandmistrustfulsharing– tobestdescribehowthe problemwe aim to solve is addressed. Section 6.2discusses the peerrolesandtheirsharingbehavior,thecontentsharingprocess, andtheroleofmistrustfulsharingonit.Theattackmodelconsid- eredisdefinedinSection6.3.Sections6.4through6.8characterize theinstantiationsofeachoneofthemechanismsusedontheeval- uationofourmodel,whosemain focusisonits feasibilityrather thanonitsoverallperformance.
6.1. Overview
TheMistrustfulP2P modelisbuiltontheconcept ofmistrust- ing all the entities participating in the P2P network, hence its name,andthereforeusers arenot requiredto establishanytrust linksinordertoparticipateinthecontentsharing.Itreliesontwo main building blocks – content interest disguise and mistrustful sharing– andaimsathidingusercontentinterests throughplau- sibledeniability,inuntrustedP2Pnetworks,whileovercomingthe mainlimitationsofakinP2Pfilesharingsystems.Theselimitations canbesummarizedasfollows:(1)peersare requiredtoadvertise what they download enabling passive attacks; (2) protection againstactive attacksis onlyachievedby introducingeithertrust
requirements or considerable network overhead; (3) the privacy protectionagainstboth passiveandactiveattacks isprobabilistic;
(4) legitimate users may be held legally liable for, unknowingly andunwillingly,downloadingorrelayingtrafficofillegalcontents.
Content interest disguise, as in other privacy-preserving P2P systems,hidesusercontentintereststhroughplausibledeniability.
The usercontent interests are hidden by downloading both con- tentsthat theuser isinterestedin (genuine)andadditional con- tents of no interest to the user (cover), as long as they cannot bedistinguished. Registeringata trackerandjoininga swarmno longerrepresentsinterestinthat content,andusercontentinter- estscannolongerbeidentifiedbymonitoringjustasmallfraction ofthenetwork[9].Acontentinterestdisguiseschemeselectsthe setofcovercontents,howmuchofeachonetodownload,andmay impose constraints to the content sharing in order to hide user contentinterests. The evaluationof the Mistrustful P2P model is centered on the feasibility of its novelty, the mistrustful sharing buildingblock,andtherebytheproposalofacontentinterestdis- guiseschemeisoutofthescopeofthiswork.
Themistrustfulsharingbuildingblockenablestheusertocon- figuretherequiredtrade-off betweenprivacyandperformanceby definingthe size c ofthelargest colluding groupto be protected against and the minimum amount m of chunks that are to be downloaded per covercontent (minimum network disguiseover- head),wherec≤m<kforanycontentpartitionedintokchunks sothatthereisnoproofoffullcontentdownload(proofofdown- load).Proof of access is then avoided by encoding contents in a waythatonlyenablesdecodingafterfulldownload.Themistrust- fulsharingbuildingblockiscomposedoftwocoremechanisms– erasurecoding and disclosureconstraint –, andthree supporting mechanisms– blockselection,requestbackoff,andpeerselection.
ItenablestheMistrustfulP2Pmodeltoovercomelimitations(1)to (4)asfollows.
Peersavoidadvertisingwhattheydownloadbyrequestingfrom otherpeersrandom chunks,thus defeatingpassive attacksofany size;theblockdownloadprocessandthemessagesexchangedare describedinSection6.2.Attackershavethentoengageinthecon- tent sharing because the chunks owned ormissing are only im- plicitlydisclosedtootherpeerswhilesharing.Activeattacks,ofa size up to c, are defeatedby constraining the amount ofchunks disclosedtoanysetofcpeerstobe,atmost,m:anattackerisnot ableto distinguish covercontents fromgenuine ones because at leastmchunks aredownloaded ofeach content.As so,usercon- tentinterestsaredeterministicallyhidden.Forlegitimateusers,le- galliabilityispreventedby notrequiringpeersto relaytrafficon behalf ofother peers, by neverfully downloading covercontents (theirdataisneveraccessible),andby avoidingbothproof ofac- cessandproof ofdownload foranyattack ofa size up to c.The protectionofusercontentinterests andtheuser liabilityare dis- cussedinmoredetailinSection7,whichalsodescribesthecoun- termeasuresemployedagainstcommonattacks.
The erasurecodingmechanismisthecoremechanismusedto enabletheprobability ofrandomly retrievinga chunk tobecome significant,andto enabledecoding onlyafterfulldownload.Con- sidering a content divided into k chunks, uniformly distributed acrossthenetwork,the probabilityofretrievingthe lastchunk is just1/k.The erasurecodingmechanismgenerates a setofnera- surecoded chunks (blocks),from a set of k chunks, so that any subset of k blocks enables to retrieve the content, where k= k(1+
(k))and
(k)istheerasurecodingoverhead.Ifthenblocks
arealsouniformlydistributedacross thenetwork, theprobability ofretrievingthelastrequiredblockincreasesto1−(k−1)/n.As anexample,fork=20,n=5·k=100,and
(k)=0(optimalera- surecode),theprobabilityincreasesfrom5%to81%.
The disclosure constraint mechanism is the core mechanism used to ensure that no more than m blocks are disclosed (re-
quested orshared) toany setof c peers (largest colluding group considered),andthuspreventsproofofdownload(m< k).Italso contributesto thereduction ofthe overheaddueto coverdown- loads because only m blocks need to be downloaded per cover contentinorderto avoidthe identificationofgenuine downloads amongcoverdownloads.Thismechanismenablesthedisclosureof atleastoneblocktoeachpeer(m≥c),and,onaverage,m/cblocks canberequestedfromeachpeer.Reducingtheamountofavailable block requestsbyeitherincreasingc ordecreasingmmayimpact theoverallperformance.
The remaining three mechanisms – block selection, request backoff, and peer selection – are defined to enable the evalua- tion ofourmodel,andto ensure thatcontents are timely down- loaded.Theblock selectionmechanismdetermines whichblockis tobeoffered toagivenrequestingpeer,affectingthedistribution ofblocksamongpeersandthereforetheprobabilityofretrievinga usefulblock(innovativeblock).Therequestbackoff mechanismde- terminesthe delaybetweenblockrequests aimingatmaximizing theamount ofusefulblocks thatcan be obtainedfromthe avail- ableblockrequestsintheshortesttimeframe.Withthesameaim, thepeerselectionmechanismselectsa peertowhichablock re- questwillbesent.
The instantiation provided for each mechanism, described in Sections 6.4 through 6.8, is the one used to evaluate our model inSection 8.Due to thelarge numberofvariables andfactorsat play,weconsidertheimpactofthosethatareexpectedtochange more often – c andm, content size, peerarrival rate, and num- ber of seeders–, and ofcover downloads on the average down- loadbitrate andon the download completion ratio.For the sake ofclarityandtractability,otherfactorsandvariablessuchasinter- contentrelations,incentivestoshare,parallel chunkrequests,and Internetconnection heterogeneityarenot considered.Theevalua- tionaimsatdemonstratingthefeasibilityofourmodelratherthan atoptimizingitsoverallperformance,i.e.,itaimsatdemonstrating thatpeersareabletotimelydownloadcontentswithoutadvertis- ingwhattheydownload.
In sum, the mistrustful sharing building block reinforces the content interest disguise because the distinction between gen- uineandcovercontents ishardenedbynotdisclosingwhatpeers downloadormiss,andalargersetofcovercontentscan beused astheydonotneedtobefullydownloaded.Itpreventslegitimate usersfrombeingheldliable dueto covercontentandmisleading contentdownloads.Cover contentsarenever fullydownloaded to guaranteethattheuserhasneveraccesstotheircontentdata.Mis- leadingcontentsmaybefullydownloaded,butthereisnoproofof accessorproofofdownloadforanyattackofasizeuptoc. 6.2. Peerrolesandcontentsharing
Peers,percontent,cantakeoneoftworolesdependingontheir privacyrequirementsandthewaytheycontributetothefileshar- ing:seeder– peerhavingacontentthatwantstoshare,andwilling to forgo its privacy –, orcommoner – peerwilling to participate in the content sharing ifits privacy requirements can be met.A seedermaybetheauthororapartyinterestedinpublishingacon- tent,andthereforedoesnotrequiretheconcealmentofitscontent interests. It generates a unique block (erasure coded chunk) for each requestit receives, andonly refusesto serve block requests ifit has no resources available. Onthe other hand,a commoner doesnotgeneratenewblocks,onlysharesthemifitsinterestsre- main hidden, and onlyhas accessto the content data afterfully downloadingthecontent.Acommonerkeepstrackoftheblocksit shareswithotherpeersbothforprivacyprotectionandtoavoidof- feringablocktwicetothesamepeer.Itmayrefusetoserveblock requestsifithasnousefulblockstooffer,duetoresourceorpri- vacyconstraints,orduetocontentinterestdisguisestrategies.The
SEEDER COMMONER
Owns
(all chunks) (downloaded blocks)
Shares
(generated blocks) (downloaded blocks)
Downloads (does not download)
(enough useful blocks)
…
…
Owned data chunk Missing block (erasure)
Block (erasure coded chunk)
Fig. 3. Peer roles and sharing behavior on the Mistrustful P2P model. A seeder has all data chunks and generates a new block (erasure coded chunk) for each incoming request; a commoner is a peer that may be still downloading enough blocks to reconstruct the content data and shares the blocks it has already downloaded. As so, a commoner never shares data chunks and only has access to the content data after fully downloading the content.
r e d i v o r P r
e t s e u q e
RPeerA) (PeerB) (
Request(any)
−−−−−−−−−−−→
Offer(x)
←−−−−−−−− (if not refused) Accept(x)or Cancel(x)
−−−−−−−−−−−−−−−−−−−→
T ransfer(x)
←−−−−−−−−−− (on acceptance) Fig. 4. Messages exchanged during block download process. Peer A requests a ran- dom block from peer B, which either refuses the request or offers a block with id x . If the block is accepted by peer A, it will get transferred; otherwise, no transfer occurs to avoid unnecessary network overhead.
commoner neverdiscloses thereasonbehindrefusal. Fig.3 sum- marizesthepeerrolesandtheirsharingbehavior.
Theblock downloadprocessontheMistrustfulP2P modeldif- fers considerablyfromtheone onother P2P filesharingsystems, giventhat peers donot advertisewhat they download.This pro- cess is summarizedin Fig. 4,where peer Ais the requester and peer B is the provider, and works as follows. Peer A requests a random block from peer B, i.e., without providing the id of an intended block. Peer B then either refuses therequest by simply ignoring it, andthus not disclosingthe reasonbehind refusal, or replieswiththeidoftheblockitiswillingtoshare,whichmustbe differentfromanyotherthat mayhavebeenpreviouslydisclosed betweenthem (both asrequester andasprovider). Ifpeer Bhas offeredablock,peerAsendsareplymessageeitheracceptingthe offeredblock,inordertostartitsdownload,orcancelingtheblock request, in orderto avoidunnecessary network overhead.Unless theblockrequesthasbeencanceled,peerBsendstheblockithas offered to peer A. The possible outcomes of a block request are furtherdiscussedinSection6.6.
The content download process of the Mistrustful P2P model, depicted in Fig. 5, consists in the following steps: (1) the peer registers at the tracker to join the swarms of both genuine and cover contents in order to disguise the content interests of the user, without disclosingits role; (2)it requests a listof peers (a subset)intheswarmfromthetracker,whichisnotawareofeach peer’s role;(3) it selects peersfrom that list that cope withthe userprivacy requirements(eligiblepeers);(4)itrequests random blocksfromthosepeers inordertocompleteits download;(5)if
A
Tracker 1. Register
S C A
mrawS eht nioJ
SWARMX
reeP tceleS .3
S
C C
C C
C
SWARMC1 SWARMCZ
SWARMC0
S Seeder C Commoner Swarm Tracker
4. Request a Random Block
A Unregistered Peer A A Commoner A 5. Transfer Useful Block
Fig. 5. Overview of the content download process on the Mistrustful P2P model.
First, peer A registers at the tracker to join the swarm of content X (swarm X) without disclosing what blocks it owns or misses; it also joins swarms C 0through C zto disguise user content interests. Then, it requests a list of peers (a subset) in the swarm from the tracker. Until download completion, and as long as the privacy requirements are met, peer A selects eligible peers in the swarm and requests them random blocks. To prevent unnecessary network overhead, only offered blocks that are useful get transferred; otherwise, the block requests are canceled. The list of peers may be updated during content download, and there is no notification upon download completion.
A
S
reeP tceleS .3
C EC
DC RB
PS
DC
S Seeder C Commoner Swarm A Commoner A
EC Erasure Coding RBRequest Backoff BS Block Selection PS Peer Selection DCDisclosure Constraint
BS
EC
Data chunk Block Missing block 4. Request a Random Block
5. Transfer Useful Block
Fig. 6. Mistrustful P2P model mechanisms and their role in the content sharing.
The erasure coding mechanism is used by a seeder to generate new blocks, and by a commoner to retrieve the content data after fully downloading a content. The dis- closure constraint mechanism determines if a block request can be sent to a peer or accepted by a given commoner. The block selection mechanism determines which block is to be shared, if any; the same block is never offered twice to the same peer.
The peer selection mechanism selects an eligible peer to which a block request can be sent. The request backoff mechanism defines the delay between block requests.
theselectedpeersofferusefulblocks,thoseblocksgettransferred;
otherwise,the requests are canceled andno transfer occurs. The listofpeersmaybeupdated duringcontentdownload,andsteps (3),(4)and(5)are repeateduntil downloadcompletion (genuine contents)or untilat leastm blocks have beentransferred (cover contents).Thereisnonotificationfromthecommonerupondown- loadcompletion.
Thescopeofactionofthemistrustfulsharingmechanismsen- tailsmostlysteps(3),(4)and(5).Theirroleonthe contentshar- ingisillustratedinFig.6.Theerasurecodingmechanismenablesa commonertoretrievethecontentdataafterfullydownloadingthe codedcontent, andenablesa seeder togeneratea newblock for
each incoming request. The disclosure constraintmechanism de- terminesifa block request canbe sent to a peeroracceptedby agivencommoner.The blockselectionmechanismisusedbythe contactedcommonertodeterminewhichblock istobeofferedto therequestingpeer,ifany.Therequestbackoff mechanismdeter- minesthedelaybetweenblockrequestsofacommoner.Thepeer selectionmechanismselectsthepeertowhichtheblockrequestis sent.
6.3.Attackmodel
We assume that an attackermight be anyentity that partici- pates inthe system – apublisher, a tracker, a regular peer, ora groupofcolluding peers.Externalentitiesmonitoringalltraffic of apeer,such asISPs orgovernments,are out ofthe scope ofthis work.
Beingaparticipantofthesystem,weconsiderthatanattacker isable to engage inthe content sharing as a commoneror asa seeder, to be a tracker, and to publish contents with misleading description.Also, we consider that an attacker may coordinatea largenumberofpeersthat colludewitheach other(collusion at- tack) orassume multiple pseudonymousidentities (Sybil attack), whichisequivalenttoalargercolludinggroup.
The creation of large number of pseudonymous identities (Sybil attack) was first considered by Douceur [34], and is one of the most dangerous attacks that plague P2P networks [35]. Douceur[34] showed that, without trusted identity certification, Sybil attacks are always possible when considering realistic sce- narios.In untrusted P2P networks, unlike collusion attacks,Sybil attacks can be mitigated without explicit information either by performing resource testing or by applying recurring costs and fees[35].We referthereaderto[36] fora surveyonexistingap- proaches.
In thecontext ofour work,collusion andSybilattacks aimat increasingtheamountofblocksdisclosedbyotherpeerstoasin- gle entity or to colluding entities, so that either content down- loadcan be proven or usercontent interests can be determined.
TheMistrustfulP2Pmodelprovidesprivacyprotectionagainstsuch attacksof a size up to c peers (colluding group),be it colluding peers,Sybilpeers,oracombinationofboth.Itrequiresc≤m<k, andtherefore mandk areupperboundsforc.Ontheone hand, despiteincreasingtheminimumnetworkdisguiseoverhead,mcan beincreasedasneeded(uptok)giventhatitisuserconfigurable.
Ontheotherhand,kisdefinedbythepublisherofthecontentand cannotbechanged.Thereby,inordertoextendtheprivacyprotec- tionwithoutincreasingthesizeofthelargestcolludinggroupcon- sidered(c),theusermayconfigureassingleentitiesallthesetsof peersthatheconsiders,orsuspects,tobecolludingortobeSybil peers.
Peers are identified by their public IP addresses because we considerthatIPaddressesprovideamoreflexibleresourcetesting thatcan be madehardertoacquire thanother resources such as humantime,networkbandwidth,computationalpowerorstorage capacity,whichcannotberelated.Forprivacyprotectionpurposes, asetofpeerswhoseIPaddressesareconsideredtoberelatedcan be treated asa singlepeer (IP address aggregation), or multiple peers sharing a single IP address can be treated individually by consideringthe (IP,port) pairasthe identifier(IP address multi- plexing).Theuser isthen abletoconfigure howIP addressesare treatedforprivacyprotectionpurposes,providingtheflexibilityto goaslowastreatingall(IP,port)pairsasuniqueidentities,e.g.all peersbehindNAT5,uptotreatingallIPaddressesofagivenentity, colludingentities,city,countryoranyothersetofIPaddressesasa
5Network Address Translation.
singlepeer.Thisalsoenablestheusertoconfigurethesetofrules that are applied to peers using anonymous systems such asTor, whichcanbeIPaddressaggregationrules,IPaddressmultiplexing rulesoranycombinationofboth.
6.4. Erasurecoding
An erasure code generates a set of n symbols from a set of k symbols, k < n, so that any subset of k(1+
(k)) is enough to reconstructtheoriginal information,where
(k) is theerasure
codingoverhead.Erasurecodes areusually classifiedaccordingto threeorthogonalproperties:systematicity,ratefixedness,andcod- ing overhead.An erasurecode issystematic iftheinput symbols areembeddedintooutputsymbols,andnon-systematicotherwise.
If n is static and needs to be known before encoding, the era- surecodeisfixed-rate.Ifncan bedynamically increasedandthe amount of symbols that can be generated does not impose any practicallimitation,theerasurecodeisrateless.Finally,anerasure code is said Maximum Distance Separable (MDS) if any k sym- bols out of n are enough to reconstruct the original information [
(k)=0],ornon-MDS ifadditionalsymbolsare required[
(k) >
0].Non-MDS erasurecodesreduce significantly theencoding and decodingtimecomplexityordersbyintroducingcodingoverhead.
The erasure coding mechanism supports both MDS and non- MDSrateless erasurecodes,butrequiresthe contentto beeither codedorencryptedinawaythatonlyenablesdecodingafterfull download.Forevaluationpurposes,weconsiderscenariosinwhich theMDSerasurecodes aremoresuitable, e.g.those inwhichthe networkisthemostconstrainedresource,andthuserasurecoding overheaddoesnotrepresentoneadditionalvariablethat needsto be considered [
(k)=0]. We refer the reader to [37] fora rate- lessMDSconstruction ofReed-Solomon codesthat we developed for our model. These erasure codes are defined over the finite field Fp2 with(nlogk) encoding andmin{nlogn,klog2k}decod- ing timecomplexities, wherepisa Mersenneprime (p=2q−1) andn≤2q+1.TheirperformancewasevaluatedoverF(231−1)2,son
≤232,anddoesnotimposeanyconstraintstothecontentsharing.
Theyalsoenabledecodingonlyafterfullcontentdownload.
6.5. Disclosureconstraint
Thedisclosureconstraintmechanismenablestheusertoconfig- ure,per content,the requiredtrade-off between privacyandper- formance by setting the size c of the largest colluding group to be protectedagainst,andthe minimumamountm ofblocks that needtobedownloadedpercovercontent.Thesizecofthelargest potentialattackerisdefinedbythenumberofuniquepeers(unre- latedpublicIPaddresses)thatarecontrolledbyasinglegroup,ei- therasingleattackeroragroupofcolludingattackers.Thismech- anism ensures that coverandgenuine content downloadscannot be distinguished by tracking their downloadprogress because at mostmblocksaredisclosedtoanysetofcpeers,andthatnoma- licious peercan prove that a userdownloaded a content orhad accesstoitsdata(m<k).
Finding the maximum intersection betweenthe set of blocks disclosedtoanysetofcpeersisanNP-hardproblem[38],thuswe devisedaconservativeyetefficientalgorithmtoevaluatedynami- callythenumberofblocksthatcanstillbesharedwithapeer.The algorithm is divided intotwo main functions: one to update the counterofblocksdisclosedtoapeer(Algorithm1– UpdateBlocks Disclosed),andthe othertodetermine thenumberofblocksthat can still bedisclosed toa peer(Algorithm2– Blocksto Disclose Left).ThevariablescommonersandblocksDisclosedarerespectively anarray sortedbythenumberofblocksdisclosed,andthemaxi- mumnumberofblocksdisclosedtoanysetofcpeers.
Algorithm1 UpdateBlocksDisclosed.
functionIncrementBlocksDisclosed(id) i←commoners.getIndex
(
id)
ifin
v
alidIndex(
i)
then New.commoners.push
(
id)
commoners.last.blks←1 i←commoners.getIndex(
id)
else Known.
commoners[i].blks←commoners[i].blks+1 j←i−1
while
v
alidIndex(
j)
do j≥0 blksI←commoners[i].blksblksJ←commoners[j].blks
if blksI>blksJthen Stillunsorted.
swap
(
commoners[i],commoners[j])
i← jj← j−1
else Sorted.
break end if end while endif
ifi<c then Changesontopcpeers.
blocksDisclosed←blocksDisclosed+1 endif
endfunction
Algorithm2 BlockstoDiscloseLeft.
functionBlocksDiscloseLeft(id)
ifc>mor m≥kthen Invalid.
return0 endif
top←min
(
c,commoners.length)
Toppeers.le ft←m−blocksDisclosed−
(
c−top)
i←commoners.getIndex(
id)
ifin
v
alidIndex(
i)
then New.if commoners.length≥cthen
le ft←le ft+commoners[c−1].blks else
le ft←le ft+1 end if
else Known.
if i≥cthen
le ft←le ft+commoners[c−1].blks le ft←le ft−commoners[i].blks end if
endif returnle ft endfunction
Algorithm1 receivesasinput theid ofthepeerto whichone additional block was disclosed. If none has yet been disclosed, a new entry is created; otherwise, the entry is updated and, if needed, some elementsare swappedto keep thearray sorted.In eachcase,iftheupdatedentryisononeofthetopcpositions,the maximumnumberofblocksdisclosedisupdated.Algorithm1has lineartimecomplexity.
Algorithm2alsoreceivesasinputtheidofthepeer.Ifthecon- figured privacyrequirementsare not met(invalid), noblockscan be disclosed toany peer.If they are met,left contains thenum- berofblocksthatcanstillbedisclosed,ensuringthatatleastone block can be disclosed to each one of the top c peers; atmost, m−(c−1)blockscan be disclosedtoa singlepeer. leftneeds to be updated ifthere arealready atleastc peers andthe peerre- ferredbyidisoutsideofthatset.Algorithm2runsinlogarithmic time.
6.6.Blockselection
The block selection mechanism is used by commoners to de- terminewhichblockistobeoffered toa requestingpeer.Itplays animportantroleonhowtheblocksendupdistributedacrossthe network,affectingtheprobabilityofpeersobtainingusefulblocks.
Thismechanismensuresthatnouploadedblockisofferedtwiceto thesamepeer, anddetermineswhen requestsshould be refused.
Therequestrefusalmaybeduetothelackofusefulblockstooffer, duetoresource orprivacy constraints,ordueto content interest disguisestrategies.
Withtheaimofbalancingthedistributionofblocksacrossthe network,eachpeerattributesaweighttoeachblockitowns. The weight wi of each block is updated according to the perception of the peer about its availability, which is based on the accep- tanceorcancellationoftheofferedblocks.Theblockstoofferare pickedthrough arandom weightedselection, andrecentlydown- loaded blocks start with a weight ws. When a block is offered, if it is accepted, the weight is updated using an additive factor, wλ;otherwise,theweightisupdatedusingamultiplicativefactor, wη.Therefore,toensurethattheweightdecreaseonacceptanceis nevergreater than the one oncancellation, the weight update is givenbyEq.(1).
wi=
max1,wi−wμ
, ifitisaccepted max
1,
wi wη
, otherwise wherewμ=min wi−
wiwη
,wλ
(1)Withthe MistrustfulP2P model there isno need tosuddenly terminate orremove downloads nor to stop sharing because the provided protection does not depend on the time a peer keeps sharing a content, as long as cover and genuine downloads are treatedthesameway.Toevaluateourmodel,weconsideredws= 100, wλ=5, and wη=2 with the goal of favoring more recent blocks.
6.7.Requestbackoff
Therequestbackoff mechanismdetermines thedelaybetween block requests to help maximizing the amount of useful blocks thatcanbeobtainedfromtheavailableblockrequestsintheshort- esttimeframe.Themechanismidentifiesthesetofpeerstowhich blockrequestscanbesent(eligiblepeers),anddeterminesforhow long no block requests should be sent. Therefore, asthe former is a direct result of individual peer behavior and the latter de- pends on the swarm behavior, we define the backoff time as a two-dimensionalvariable that hasper peerandper swarm com- ponents.Thepeerbackoff componentprovidesthedelaytoreturn apeertothe setofeligiblepeers; theswarm backoff component providesthedelayuntilthenextblockrequest.Theactualbackoff time israndomly generatedwithin theinterval [0,b], whereb is thecalculatedbackoff time.
A block request has five possible outcomes: (1) refusal – the requestisrefusedbythecontactedpeer;(2)cancellation– there- questiscanceledbytherequester(duplicateblock);(3)acceptance
OUTCOME Peer A (Requester)
Peer B (Provider) Refusal
Cancellaon
Acceptance
Interrupon
Disposal
X
X X
Request Offer
Cancel Transfer
Accept
X
Fig. 7. Possible outcomes of a block request: refusal – the request is refused by peer B (no offer); cancellation – the request is canceled by peer A (no transfer);
acceptance – the request is accepted and the block is transferred from peer B to peer A ; interruption – the request is accepted but the block transfer is interrupted;
disposal – no request is sent.
– the request is accepted anda block is downloaded; (4) inter- ruption– therequestisacceptedbutthedownloadisinterrupted;
(5)disposal – norequestissentduetothelackofeligible peers.
Refusaland disposal discloseno information about block owner- ship,butalltheothersdo.Cancellationandacceptancerevealthat both peers alreadyown that block; interruptionreveals that the contactedpeerownsthatblock.Fig.7illustrateseachpossibleout- come.
Consider uandvtobe respectivelythenumberofconsecutive refusedrequestsandthenumberofconsecutiverequeststhatwere eithercanceledorinterrupted,anduiandvithesamevariablesbut fora peeri; by consecutivewemean untilarequest isaccepted.
Let
μ
,α
,λ
,β
,η
,andτ
¯ berespectivelythemaximumbackoff time, thebaselinearbackoff time,thelinearbackoff timefactor,thebase exponentialbackoff time,theexponentialbackoff time factor,and the estimatedaverage block transfer time. The backoff time b is thengivenbyEq.(2).b=min
μ
,α
+λ
u+β η
v−1(2) Forblockrequeststhatdonotdiscloseblockinformation,u,itis appliedalinearincrease,
λ
;forblockrequests thatdiscloseblockinformation,v,itisappliedanexponentialincrease,
η
.Thebackofftime never exceeds
μ
.The peer andthe swarm components areexpressedusing Eq.(2),butthe values oftheir variables maybe different.Thus,thevariablesofeachcomponentaredistinguished byapsubscript(peer)andanssubscript(swarm).Fortheswarm component only, when there are no eligible peers
∀
i,bpi>0, bs=min
λ
p,minbpi
.
Forthe sakeofclarity andtractability,the evaluationassumes no simultaneous block requests, homogeneous Internet connec- tions,andasinglecontentdownload.Therefore,inordertoevalu- ateourmodel,we considered
α
p=100 ms,β
p=τ
¯/4,λ
p=τ
¯/10,η
p=2,andμ
p=kmτ¯c forthepeercomponent.Theperpeerback- off timeis a function of theblock transfer time, andis instanti-Table 2
Summary of countermeasures employed to prevent common attacks.
Attack Countermeasure(s)
DoS Blacklist misbehaving peers (at application level).
Sybil and Collusion
Avoid block advertisement; employ the disclosure constraint mechanism; treat multiple peers using related IP addresses as pseudonymous identities of a single entity.
Peer Selection Use multiple unrelated trackers for the same content.
Forgery and Repetition
Peers communicate through direct links.
Pollution Each block has a checksum signed by the publisher.
ated such that the block requests sent to the samepeer are, on average,50msapartandsuchthat,onaverage,m/cblockrequests canbesenttoeachpeerduringcontentdownload.m/crepresents the configured protectionand k
τ
¯ theminimum time required to downloadagivencontent.Fortheswarmcomponent, weconsid- eredα
s=0,β
s=τ
¯/4ρ
,λ
s=τ
¯/16ρ
,η
p=2,andμ
s=τ
¯,whereρ
isthenumberofactivepeers.Theswarmbackoff timeisafunction ofbothblocktransfertimeandthenumberofactivepeers,which canbeobtainedfromthetracker(s).Giventhattheevaluationas- sumesnosimultaneousblockrequests,thereisnominimumdelay requiredbetweenconsecutiverequests,whichcanbe,atmost,the timerequiredtodownloadasingleblock.
6.8. Peerselection
The peer selection mechanism aims at selecting the eligible peer that has thehighest probability of providing a usefulblock in lesstime. The set of eligible peers isconstrained both by the disclosureconstraintandrequestbackoff mechanisms.Theformer provides,foragivencontent,thesetofpeerstowhichnofurther block requests can be sent to; the latter provides, for the same content, theset ofpeers that are ineligible forthe moment,and whencanthenextblockrequestbesenttoeachoneofthem.
Forthesakeofclarityandtractability,peerswereselectedran- domlyinordertoavoidaddinganadditionalfactorintotheeval- uation. Non-uniform selection of peers is expected to impact on howthe blocksendup distributedacross thenetwork, especially whenconsideringheterogeneousInternet connections:peerswith more available bandwidth are able to replicate more rapidly the blockstheyown.
7. Securityanalysis
Thissectionprovides asecurityanalysisofthe MistrustfulP2P model.WestartbydescribingthecommonattacksinP2Psystems, andpresentthecountermeasuresemployed.Then, wediscussthe identification of user content interests, the proof of full content download,anduserlegalliability.
7.1. Commonattacksandcountermeasures
This section describes the common attacks in P2P file shar- ingandthecountermeasuresemployed,whicharesummarizedin Table 2, focusing on the context of our work. The attack model consideredassumesthat an attackercan beanyentityparticipat- inginthesystem– apublisher,atracker,aseederoracommoner – ora group ofthoseentities incollusion.Externalentities, such asISPsandgovernments,areoutofthescopeofthiswork.
DoS. A denial-of-service attack aims eitheratshutting down the entire systemor at making one or more contents unavailable. It maytargettrackerstopreventpeersfromknowingeachotherand thereforedisablecontentsharing,ortargetseederstopreventnew